3 Intensity Transformations and Spatial Filtering
It makes all the difference whether one sees darkness
through the light or brightness through the shadows.
David Lindsay
Preview
The term spatial domain refers to the image plane itself, and image processing methods in this category are based on direct manipulation of pixels in an image. This is in contrast to image processing in a transform domain which, as introduced in Section 2.6.7 and discussed in more detail in Chapter 4, involves first transforming an image into the transform domain, doing the processing there, and obtaining the inverse transform to bring the results back into the spatial domain. Two principal categories of spatial processing are intensity transformations and spatial filtering. As you will learn in this chapter, intensity transformations operate on single pixels of an image, principally for the purpose of contrast manipulation and image thresholding. Spatial filtering deals with performing operations, such as image sharpening, by working in a neighborhood of every pixel in an image. In the sections that follow, we discuss a number of "classical" techniques for intensity transformations and spatial filtering. We also discuss in some detail fuzzy techniques that allow us to incorporate imprecise, knowledge-based information in the formulation of intensity transformations and spatial filtering algorithms.
3.2 Some Basic Intensity Transformation Functions
A transformation function of the form shown in Fig. 3.2(a) produces an image of higher contrast than the original by darkening the intensity levels below a value k and brightening the levels above k. In this technique, sometimes called contrast stretching (see Section 3.2.4), values of r lower than k are compressed by the transformation function into a narrow range of s, toward black. The opposite is true for values of r higher than k. Observe how an intensity value r0 is mapped to obtain the corresponding value s0. In the limiting case shown in Fig. 3.2(b), T(r) produces a two-level (binary) image. A mapping of this form is called a thresholding function. Some fairly simple, yet powerful, processing approaches can be formulated with intensity transformation functions. In this chapter, we use intensity transformations principally for image enhancement. In Chapter 10, we use them for image segmentation. Approaches whose results depend only on the intensity at a point sometimes are called point processing techniques, as opposed to the neighborhood processing techniques discussed earlier in this section.
Image negatives

The negative of an image with intensity levels in the range [0, L - 1] is obtained with the negative transformation, given by the expression

$$ s = L - 1 - r \tag{3.2-1} $$

Reversing the intensity levels of an image in this manner produces the equivalent of a photographic negative.

FIGURE 3.3 Some basic intensity transformation functions (log, nth power, and related curves), with input intensity level r and output intensity level s each spanning [0, L - 1].
FIGURE 3.4 (a) Original digital mammogram. (b) Negative image obtained using the negative transformation in Eq. (3.2-1). (Courtesy of G.E. Medical Systems.)
This type of processing is particularly suited for enhancing white or gray detail embedded in dark regions of an image, especially when the black areas are dominant in size. Figure 3.4 shows an example. The original image is a digital mammogram showing a small lesion. In spite of the fact that the visual content is the same in both images, note how much easier it is to analyze the breast tissue in the negative image in this particular case.
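The negative transformation is a one-line array operation. The following is a minimal sketch of Eq. (3.2-1), assuming an 8-bit image stored as a NumPy array (the function name and the tiny test array are illustrative, not from the text):

```python
import numpy as np

def negative(image, L=256):
    # Eq. (3.2-1): s = L - 1 - r, applied to every pixel.
    # Cast up first so the subtraction cannot wrap around in uint8.
    return ((L - 1) - image.astype(np.int32)).astype(np.uint8)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(img))   # [[255 191] [127   0]] -- dark and light are reversed
```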
FIGURE 3.5 (a) Fourier spectrum. (b) Result of applying the log transformation in Eq. (3.2-2) with c = 1.
One of the principal uses of the log transformation, s = c log(1 + r) [Eq. (3.2-2)], is to compress dynamic range: Fourier spectra, for example, can contain values ranging over many orders of magnitude, and image display systems generally will not be able to reproduce faithfully such a wide range of intensity values. The net effect is that a significant degree of intensity detail can be lost in the display of a typical Fourier spectrum.
As an illustration of log transformations, Fig. 3.5(a) shows a Fourier spectrum with values in the range 0 to 1.5 × 10⁶. When these values are scaled linearly for display in an 8-bit system, the brightest pixels will dominate the display, at the expense of lower (and just as important) values of the spectrum. The effect of this dominance is illustrated vividly by the relatively small area of the image in Fig. 3.5(a) that is not perceived as black. If, instead of displaying the values in this manner, we first apply Eq. (3.2-2) (with c = 1 in this case) to the spectrum values, then the range of values of the result becomes 0 to 6.2, which is more manageable. Figure 3.5(b) shows the result of scaling this new range linearly and displaying the spectrum in the same 8-bit display. The wealth of detail visible in this image as compared to an unmodified display of the spectrum is evident from these pictures. Most of the Fourier spectra seen in image processing publications have been scaled in just this manner.
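As a minimal sketch of this rescaling, assuming the log transformation of Eq. (3.2-2) is s = c log(1 + r) with a base-10 logarithm (the text does not state the base here, but base 10 reproduces the quoted 0-to-6.2 range for a maximum value of 1.5 × 10⁶):

```python
import numpy as np

def log_transform(values, c=1.0):
    # Eq. (3.2-2): s = c log(1 + r); the +1 keeps log(0) out of the picture.
    return c * np.log10(1.0 + values)

spectrum = np.array([0.0, 1.0e3, 1.5e6])       # illustrative spectrum values
s = log_transform(spectrum)                    # range is now 0 to about 6.18
display = np.round(255.0 * s / s.max()).astype(np.uint8)  # linear scaling to 8 bits
print(s, display)
```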
Power-law (gamma) transformations

Power-law transformations have the basic form

$$ s = c\, r^{\gamma} \tag{3.2-3} $$

where c and γ are positive constants.

FIGURE 3.6 Plots of the equation s = c r^γ for various values of γ (curves for γ = 0.40, 0.67, 1, 1.5, 2.5, 5.0, 10.0, and 25.0, with c = 1), with input intensity level r and output intensity level s each spanning [0, L - 1]. All curves were scaled to fit in the range shown.
FIGURE 3.7 (a) Intensity ramp image. (b) Image as viewed on a simulated monitor with a gamma of 2.5. (c) Gamma-corrected image. (d) Corrected image as viewed on the same monitor. Compare (d) and (a).
EXAMPLE 3.1: Contrast enhancement using power-law transformations. ■ In addition to gamma correction, power-law transformations are useful for general-purpose contrast manipulation. Figure 3.8(a) shows a magnetic resonance image (MRI) of an upper thoracic human spine with a fracture dislocation and spinal cord impingement. The fracture is visible near the vertical center of the spine, approximately one-fourth of the way down from the top of the picture. Because the given image is predominantly dark, an expansion of intensity levels is desirable. This can be accomplished with a power-law transformation with a fractional exponent. The other images shown in the figure were obtained by processing Fig. 3.8(a) with the power-law transformation in Eq. (3.2-3), using c = 1 and γ = 0.6, 0.4, and 0.3, respectively.
FIGURE 3.8 (a) Magnetic resonance image (MRI) of a fractured human spine. (b)–(d) Results of applying the transformation in Eq. (3.2-3) with c = 1 and γ = 0.6, 0.4, and 0.3, respectively. (Original image courtesy of Dr. David R. Pickens, Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center.)
EXAMPLE 3.2: Another illustration of power-law transformations. ■ Figure 3.9(a) shows the opposite problem of Fig. 3.8(a). The image to be processed now has a washed-out appearance, indicating that a compression of intensity levels is desirable. This can be accomplished with Eq. (3.2-3) using values of γ greater than 1. The results of processing Fig. 3.9(a) with γ = 3.0, 4.0, and 5.0 are shown in Figs. 3.9(b) through (d). Suitable results were obtained with gamma values of 3.0 and 4.0, the latter having a slightly more appealing appearance because it has higher contrast. The result obtained with γ = 5.0 has areas that are too dark, in which some detail is lost. The dark region to the left of the main road in the upper left quadrant is an example of such an area. ■

FIGURE 3.9 (a) Aerial image. (b)–(d) Results of applying the transformation in Eq. (3.2-3) with c = 1 and γ = 3.0, 4.0, and 5.0, respectively. (Original image for this example courtesy of NASA.)
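A hedged sketch of Eq. (3.2-3) for 8-bit images follows. Normalizing intensities to [0, 1] before applying the power and rescaling back to [0, L - 1] is an implementation choice, not something prescribed by the text, but it keeps c = 1 meaningful for any γ:

```python
import numpy as np

def power_law(image, gamma, c=1.0, L=256):
    # Eq. (3.2-3): s = c * r**gamma, on intensities normalized to [0, 1].
    r = image.astype(np.float64) / (L - 1)
    s = c * np.power(r, gamma)
    return np.round((L - 1) * np.clip(s, 0.0, 1.0)).astype(np.uint8)

ramp = np.tile(np.arange(256, dtype=np.uint8), (16, 1))  # intensity ramp test image
expanded   = power_law(ramp, gamma=0.4)  # expands dark levels, as in Fig. 3.8
compressed = power_law(ramp, gamma=3.0)  # compresses toward dark, as in Fig. 3.9
```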
Contrast stretching
One of the simplest piecewise linear functions is a contrast-stretching transformation. Low-contrast images can result from poor illumination, lack of dynamic range in the imaging sensor, or even the wrong setting of a lens aperture during image acquisition. Contrast stretching is a process that expands the range of intensity levels in an image so that it spans the full intensity range of the recording medium or display device.

Figure 3.10(a) shows a typical transformation used for contrast stretching. The locations of points (r1, s1) and (r2, s2) control the shape of the transformation function. If r1 = s1 and r2 = s2, the transformation is a linear function that produces no changes in intensity levels. If r1 = r2, s1 = 0 and s2 = L - 1, the transformation becomes a thresholding function that creates a binary image, as illustrated in Fig. 3.2(b). Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the intensity levels of the output image, thus affecting its contrast. In general, r1 ≤ r2 and s1 ≤ s2 is assumed so that the function is single valued and monotonically increasing. This condition preserves the order of intensity levels, thus preventing the creation of intensity artifacts in the processed image.

Figure 3.10(b) shows an 8-bit image with low contrast. Figure 3.10(c) shows the result of contrast stretching, obtained by setting (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L - 1), where rmin and rmax denote the minimum and maximum intensity levels in the image, respectively. Thus, the transformation function stretched the levels linearly from their original range to the full range [0, L - 1]. Finally, Fig. 3.10(d) shows the result of using the thresholding function defined previously, with (r1, s1) = (m, 0) and (r2, s2) = (m, L - 1), where m is the mean intensity level in the image. The original image on which these results are based is a scanning electron microscope image of pollen, magnified approximately 700 times.
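The two transformations used for Figs. 3.10(c) and (d) reduce to a few lines. A minimal sketch, assuming a non-constant 8-bit NumPy image (function names are illustrative):

```python
import numpy as np

def contrast_stretch(image, L=256):
    # Linear stretch with (r1, s1) = (r_min, 0) and (r2, s2) = (r_max, L-1).
    # Assumes r_max > r_min, i.e., the image is not constant.
    r_min, r_max = float(image.min()), float(image.max())
    s = (L - 1) * (image.astype(np.float64) - r_min) / (r_max - r_min)
    return np.round(s).astype(np.uint8)

def threshold(image, m, L=256):
    # Limiting case (r1, s1) = (m, 0) and (r2, s2) = (m, L-1): a binary image.
    return np.where(image > m, L - 1, 0).astype(np.uint8)

rng = np.random.default_rng(0)
img = rng.integers(90, 160, size=(8, 8), dtype=np.uint8)  # low-contrast test data
print(contrast_stretch(img).min(), contrast_stretch(img).max())  # 0 255
binary = threshold(img, m=img.mean())
```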
Intensity-level slicing
Highlighting a specific range of intensities in an image often is of interest. Applications include enhancing features such as masses of water in satellite imagery and enhancing flaws in X-ray images. The process, often called intensity-level slicing, can be implemented in several ways, but most are variations of two basic themes. One approach is to display in one value (say, white) all the values in the range of interest and in another (say, black) all other intensities. This transformation, shown in Fig. 3.11(a), produces a binary image. The second approach, based on the transformation in Fig. 3.11(b), brightens (or darkens) the desired range of intensities but leaves all other intensity levels in the image unchanged.

FIGURE 3.10 Contrast stretching: (a) form of the transformation function; (b) a low-contrast image; (c) result of contrast stretching; (d) result of thresholding.
FIGURE 3.11 (a) A transformation that highlights a range of intensities and reduces all others to a constant level, producing a binary image. (b) A transformation that highlights a range of intensities but preserves all other levels.
EXAMPLE 3.3: Intensity-level slicing. ■ Figure 3.12(a) is an aortic angiogram near the kidney area (see Section 1.3.2 for a more detailed explanation of this image). The objective of this example is to use intensity-level slicing to highlight the major blood vessels that appear brighter as a result of an injected contrast medium. Figure 3.12(b) shows the result of using a transformation of the form in Fig. 3.11(a), with the selected band near the top of the scale, because the range of interest is brighter than the background. The net result of this transformation is that the blood vessel and parts of the kidneys appear white, while all other intensities are black. This type of enhancement produces a binary image and is useful for studying the shape of the flow of the contrast medium (to detect blockages, for example).

If, on the other hand, interest lies in the actual intensity values of the region of interest, we can use the transformation in Fig. 3.11(b). Figure 3.12(c) shows the result of using such a transformation in which a band of intensities in the mid-gray region around the mean intensity was set to black, while all other intensities were left unchanged. Here, we see that the gray-level tonality of the major blood vessels and part of the kidney area were left intact. Such a result might be useful when interest lies in measuring the actual flow of the contrast medium as a function of time in a series of images. ■
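Both slicing variants are simple masking operations. A minimal sketch, assuming 8-bit NumPy images (the band limits and names are illustrative):

```python
import numpy as np

def slice_binary(image, lo, hi, L=256):
    # Fig. 3.11(a)-style: white inside the band [lo, hi], black elsewhere.
    return np.where((image >= lo) & (image <= hi), L - 1, 0).astype(np.uint8)

def slice_preserve(image, lo, hi, value):
    # Fig. 3.11(b)-style: set the band to a constant, leave the rest unchanged.
    out = image.copy()
    out[(image >= lo) & (image <= hi)] = value
    return out

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
bright_band = slice_binary(img, 200, 255)        # highlight a bright band, as in Fig. 3.12(b)
mid_to_black = slice_preserve(img, 100, 160, 0)  # darken a mid-gray band, as in Fig. 3.12(c)
```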
Bit-plane slicing
Pixels are digital numbers composed of bits. For example, the intensity of each pixel in a 256-level gray-scale image is composed of 8 bits (i.e., one byte). Instead of highlighting intensity-level ranges, we could highlight the contribution made to total image appearance by specific bits. As Fig. 3.13 illustrates, an 8-bit image may be considered as being composed of eight 1-bit planes, with plane 1 containing the lowest-order bit of all pixels in the image and plane 8 all the highest-order bits.

FIGURE 3.12 (a) Aortic angiogram. (b) Result of using a slicing transformation of the type illustrated in Fig. 3.11(a), with the range of intensities of interest selected in the upper end of the gray scale. (c) Result of using the transformation in Fig. 3.11(b), with the selected area set to black, so that grays in the area of the blood vessels and kidneys were preserved. (Original image courtesy of Dr. Thomas R. Gest, University of Michigan Medical School.)

FIGURE 3.13 Bit-plane representation of an 8-bit image, from bit plane 1 (least significant) to bit plane 8 (most significant).
Figure 3.14(a) shows an 8-bit gray-scale image and Figs. 3.14(b) through (i) are its eight 1-bit planes, with Fig. 3.14(b) corresponding to the lowest-order bit. Observe that the four higher-order bit planes, especially the last two, contain a significant amount of the visually significant data. The lower-order planes contribute to more subtle intensity details in the image. The original image has a gray border whose intensity is 194. Notice that the corresponding borders of some of the bit planes are black (0), while others are white (1). To see why, consider a pixel in, say, the middle of the lower border of Fig. 3.14(a). The corresponding pixels in the bit planes, starting with the highest-order plane, have values 1 1 0 0 0 0 1 0, which is the binary representation of decimal 194. The value of any pixel in the original image can be similarly reconstructed from its corresponding binary-valued pixels in the bit planes.

FIGURE 3.14 (a) An 8-bit gray-scale image of size 500 × 1192 pixels. (b) through (i) Bit planes 1 through 8, with bit plane 1 corresponding to the least significant bit. Each bit plane is a binary image.
In terms of intensity transformation functions, it is not difficult to show that the binary image for the 8th bit plane of an 8-bit image can be obtained by processing the input image with a thresholding intensity transformation function that maps all intensities between 0 and 127 to 0 and maps all levels between 128 and 255 to 1. The binary image in Fig. 3.14(i) was obtained in just this manner. It is left as an exercise (Problem 3.4) to obtain the intensity transformation functions for generating the other bit planes.

Decomposing an image into its bit planes is useful for analyzing the relative importance of each bit in the image, a process that aids in determining the adequacy of the number of bits used to quantize the image. Also, this type of decomposition is useful for image compression (the topic of Chapter 8), in which fewer than all planes are used in reconstructing an image. For example, Fig. 3.15(a) shows an image reconstructed using bit planes 8 and 7. The reconstruction is done by multiplying the pixels of the nth plane by the constant 2^(n-1). This is nothing more than converting the nth significant binary bit to decimal. Each plane used is multiplied by the corresponding constant, and all planes used are added to obtain the gray-scale image. Thus, to obtain Fig. 3.15(a), we multiplied bit plane 8 by 128, bit plane 7 by 64, and added the two planes. Although the main features of the original image were restored, the reconstructed image appears flat, especially in the background. This is not surprising because two planes can produce only four distinct intensity levels. Adding plane 6 to the reconstruction helped the situation, as Fig. 3.15(b) shows. Note that the background of this image has perceptible false contouring. This effect is reduced significantly by adding the 5th plane to the reconstruction, as Fig. 3.15(c) illustrates. Using more planes in the reconstruction would not contribute significantly to the appearance of this image. Thus, we conclude that storing the four highest-order bit planes would allow us to reconstruct the original image in acceptable detail. Storing these four planes instead of the original image requires 50% less storage (ignoring memory architecture issues).
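Bit-plane extraction and the reconstruction just described reduce to shifts and masks. A minimal sketch for 8-bit NumPy images (the 194 test value is the border intensity discussed above):

```python
import numpy as np

def bit_plane(image, n):
    # Bit plane n (n = 1 is the least significant bit) as a 0/1 image.
    return (image >> (n - 1)) & 1

def reconstruct(image, planes):
    # Multiply plane n by 2**(n-1) and sum over the planes used.
    return sum(bit_plane(image, n).astype(np.uint16) << (n - 1) for n in planes)

img = np.array([[194, 77], [0, 255]], dtype=np.uint8)
print(bit_plane(img, 8))          # [[1 0] [0 1]] -- only 194 and 255 have bit 8 set
print(reconstruct(img, [8, 7]))   # [[192 64] [0 192]] -- planes 8 and 7 alone
```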
FIGURE 3.15 Images reconstructed using (a) bit planes 8 and 7; (b) bit planes 8, 7, and 6; and (c) bit planes 8, 7, 6, and 5. Compare (c) with Fig. 3.14(a).
3.3 Histogram Processing

FIGURE 3.16 Four basic image types: dark, light, low contrast, high contrast, and their corresponding histograms.
FIGURE 3.17 (a) Monotonically increasing function, showing how multiple values can map to a single value. (b) Strictly monotonically increasing function. This is a one-to-one mapping, both ways.

† Recall that a function T(r) is monotonically increasing if T(r₂) ≥ T(r₁) for r₂ > r₁. T(r) is a strictly monotonically increasing function if T(r₂) > T(r₁) for r₂ > r₁. Similar definitions apply to monotonically decreasing functions.
Figure 3.17(a) shows a hypothetical transformation function that satisfies conditions (a) and (b). Here, we see that it is possible for multiple values to map to a single value and still satisfy these two conditions. That is, a monotonic transformation function performs a one-to-one or many-to-one mapping. This is perfectly fine when mapping from r to s. However, Fig. 3.17(a) presents a problem if we wanted to recover the values of r uniquely from the mapped values (inverse mapping can be visualized by reversing the direction of the arrows). This would be possible for the inverse mapping of sk in Fig. 3.17(a), but the inverse mapping of sq is a range of values, which, of course, prevents us in general from recovering the original value of r that resulted in sq. As Fig. 3.17(b) shows, requiring that T(r) be strictly monotonic guarantees that the inverse mappings will be single valued (i.e., the mapping is one-to-one in both directions). This is a theoretical requirement that allows us to derive some important histogram processing techniques later in this chapter. Because in practice we deal with integer intensity values, we are forced to round all results to their nearest integer values. Therefore, when strict monotonicity is not satisfied, we address the problem of a nonunique inverse transformation by looking for the closest integer matches. Example 3.8 gives an illustration of this.
The intensity levels in an image may be viewed as random variables in the interval [0, L - 1]. A fundamental descriptor of a random variable is its probability density function (PDF). Let p_r(r) and p_s(s) denote the PDFs of r and s, respectively, where the subscripts on p are used to indicate that p_r and p_s are different functions in general. A fundamental result from basic probability theory is that if p_r(r) and T(r) are known, and T(r) is continuous and differentiable over the range of values of interest, then the PDF of the transformed (mapped) variable s can be obtained using the simple formula

$$ p_s(s) = p_r(r)\left|\frac{dr}{ds}\right| \tag{3.3-3} $$

Thus, we see that the PDF of the output intensity variable, s, is determined by the PDF of the input intensities and the transformation function used [recall that r and s are related by T(r)].
A transformation function of particular importance in image processing has the form

$$ s = T(r) = (L-1)\int_0^{r} p_r(w)\,dw \tag{3.3-4} $$

where w is a dummy variable of integration. The right side of this equation is recognized as the cumulative distribution function (CDF) of random variable r. Because PDFs always are positive, and recalling that the integral of a function is the area under the function, it follows that the transformation function of Eq. (3.3-4) satisfies condition (a) because the area under the function cannot decrease as r increases. When the upper limit in this equation is r = (L - 1), the integral evaluates to 1 (the area under a PDF curve always is 1), so the maximum value of s is (L - 1) and condition (b) is satisfied also.
To find the p_s(s) corresponding to this transformation we use Eq. (3.3-3). From Leibniz's rule in basic calculus, the derivative of a definite integral with respect to its upper limit is simply the integrand evaluated at that limit. That is,

$$ \frac{ds}{dr} = \frac{dT(r)}{dr} = (L-1)\,\frac{d}{dr}\left[\int_0^{r} p_r(w)\,dw\right] = (L-1)\,p_r(r) \tag{3.3-5} $$

Substituting this result for dr/ds in Eq. (3.3-3), and keeping in mind that all probability values are positive, yields

$$ p_s(s) = p_r(r)\left|\frac{dr}{ds}\right| = p_r(r)\left|\frac{1}{(L-1)\,p_r(r)}\right| = \frac{1}{L-1}, \qquad 0 \le s \le L-1 \tag{3.3-6} $$

We recognize the form of p_s(s) in the last line of this equation as a uniform probability density function. Simply stated, we have demonstrated that performing the intensity transformation in Eq. (3.3-4) yields a random variable, s, characterized by a uniform PDF. It is important to note from this equation that T(r) depends on p_r(r) but, as Eq. (3.3-6) shows, the resulting p_s(s) always is uniform, independently of the form of p_r(r). Figure 3.18 illustrates these concepts.
FIGURE 3.18 (a) An arbitrary PDF, p_r(r). (b) Result of applying the transformation in Eq. (3.3-4) to all intensity levels, r. The resulting intensities, s, have a uniform PDF, independently of the form of the PDF of the r's.
EXAMPLE 3.4: Illustration of Eqs. (3.3-4) and (3.3-6). ■ To fix ideas, consider the following simple example. Suppose that the (continuous) intensity values in an image have the PDF

$$ p_r(r) = \begin{cases} \dfrac{2r}{(L-1)^2} & \text{for } 0 \le r \le L-1 \\ 0 & \text{otherwise} \end{cases} $$

From Eq. (3.3-4),

$$ s = T(r) = (L-1)\int_0^{r} p_r(w)\,dw = \frac{2}{L-1}\int_0^{r} w\,dw = \frac{r^2}{L-1} $$

Suppose next that we form a new image with intensities, s, obtained using this transformation; that is, the s values are formed by squaring the corresponding intensity values of the input image and dividing them by (L - 1). For example, consider an image in which L = 10, and suppose that a pixel in an arbitrary location (x, y) in the input image has intensity r = 3. Then the pixel in that location in the new image is s = T(r) = r²/9 = 1. We can verify that the PDF of the intensities in the new image is uniform simply by substituting p_r(r) into Eq. (3.3-6) and using the fact that s = r²/(L - 1); that is,

$$ p_s(s) = p_r(r)\left|\frac{dr}{ds}\right| = \frac{2r}{(L-1)^2}\left|\left[\frac{d}{dr}\,\frac{r^2}{L-1}\right]^{-1}\right| = \frac{2r}{(L-1)^2}\cdot\frac{L-1}{2r} = \frac{1}{L-1} $$

where the last step follows from the fact that r is nonnegative and we assume that L > 1. As expected, the result is a uniform PDF. ■
For discrete values, we deal with probabilities (histogram values) and summations instead of probability density functions and integrals.† As mentioned earlier, the probability of occurrence of intensity level r_k in a digital image is approximated by

$$ p_r(r_k) = \frac{n_k}{MN} \qquad k = 0, 1, 2, \ldots, L-1 \tag{3.3-7} $$

where MN is the total number of pixels in the image, n_k is the number of pixels that have intensity r_k, and L is the number of possible intensity levels in the image (e.g., 256 for an 8-bit image). As noted in the beginning of this section, a plot of p_r(r_k) versus r_k is commonly referred to as a histogram. The discrete form of the transformation in Eq. (3.3-4) is

$$ s_k = T(r_k) = (L-1)\sum_{j=0}^{k} p_r(r_j) \qquad k = 0, 1, 2, \ldots, L-1 \tag{3.3-8} $$

† The conditions of monotonicity stated earlier apply also in the discrete case. We simply restrict the values of the variables to be discrete.
EXAMPLE 3.5: A simple illustration of histogram equalization. ■ Before continuing, it will be helpful to work through a simple example. Suppose that a 3-bit image (L = 8) of size 64 × 64 pixels (MN = 4096) has the intensity distribution shown in Table 3.1, where the intensity levels are integers in the range [0, L - 1] = [0, 7].

The histogram of our hypothetical image is sketched in Fig. 3.19(a). Values of the histogram equalization transformation function are obtained using Eq. (3.3-8). For instance,

$$ s_0 = T(r_0) = 7\sum_{j=0}^{0} p_r(r_j) = 7\,p_r(r_0) = 1.33 $$

Similarly,

$$ s_1 = T(r_1) = 7\sum_{j=0}^{1} p_r(r_j) = 7\,p_r(r_0) + 7\,p_r(r_1) = 3.08 $$
TABLE 3.1 Intensity distribution and histogram values for a 3-bit, 64 × 64 digital image.

    r_k        n_k     p_r(r_k) = n_k/MN
    r_0 = 0     790    0.19
    r_1 = 1    1023    0.25
    r_2 = 2     850    0.21
    r_3 = 3     656    0.16
    r_4 = 4     329    0.08
    r_5 = 5     245    0.06
    r_6 = 6     122    0.03
    r_7 = 7      81    0.02
FIGURE 3.19 Illustration of histogram equalization of a 3-bit (8 intensity levels) image. (a) Original histogram. (b) Transformation function. (c) Equalized histogram.
At this point, the s values still have fractions because they were generated by summing probability values, so we round them to the nearest integer:

    s_0 = 1.33 → 1        s_4 = 6.23 → 6
    s_1 = 3.08 → 3        s_5 = 6.65 → 7
    s_2 = 4.55 → 5        s_6 = 6.86 → 7
    s_3 = 5.67 → 6        s_7 = 7.00 → 7
These are the values of the equalized histogram. Observe that there are only five distinct intensity levels. Because r_0 = 0 was mapped to s_0 = 1, there are 790 pixels in the histogram-equalized image with this value (see Table 3.1). Also, there are in this image 1023 pixels with a value of s_1 = 3 and 850 pixels with a value of s_2 = 5. However, both r_3 and r_4 were mapped to the same value, 6, so there are (656 + 329) = 985 pixels in the equalized image with this value. Similarly, there are (245 + 122 + 81) = 448 pixels with a value of 7 in the histogram-equalized image. Dividing these numbers by MN = 4096 yielded the equalized histogram in Fig. 3.19(c).

Because a histogram is an approximation to a PDF, and no new allowed intensity levels are created in the process, perfectly flat histograms are rare in practical applications of histogram equalization. Thus, unlike its continuous counterpart, it cannot be proved (in general) that discrete histogram equalization results in a uniform histogram. However, as you will see shortly, using Eq. (3.3-8) has the general tendency to spread the histogram of the input image so that the intensity levels of the equalized image span a wider range of the intensity scale. The net result is contrast enhancement. ■
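The entire computation of this example is a cumulative sum followed by rounding. The sketch below reproduces the mapping using the exact counts of Table 3.1 (so s_0 comes out as 1.35 rather than the 1.33 obtained from the rounded probabilities, but the rounded mapping is identical):

```python
import numpy as np

def equalize_mapping(counts, L):
    # Eq. (3.3-8): s_k = (L-1) * cumulative sum of p_r(r_j), then rounded.
    p = counts / counts.sum()
    return np.round((L - 1) * np.cumsum(p)).astype(int)

nk = np.array([790, 1023, 850, 656, 329, 245, 122, 81])  # Table 3.1
s = equalize_mapping(nk, L=8)
print(s)                                   # [1 3 5 6 6 7 7 7]

# Equalized histogram: pool the counts of all source levels mapped to each s.
eq_hist = np.bincount(s, weights=nk.astype(float), minlength=8)
print(eq_hist / nk.sum())                  # the p_s values sketched in Fig. 3.19(c)
```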
An advantage of histogram equalization is that it is fully automatic: the transformation is computed directly from the histogram of the given image, with no need for further parameter specifications. We note also the simplicity of the computations required to implement the technique.

The inverse transformation from s back to r is denoted by

$$ r_k = T^{-1}(s_k) \qquad k = 0, 1, 2, \ldots, L-1 \tag{3.3-9} $$

It can be shown (Problem 3.10) that this inverse transformation satisfies conditions (a′) and (b) only if none of the levels, r_k, k = 0, 1, 2, …, L - 1, are missing from the input image, which in turn means that none of the components of the image histogram are zero. Although the inverse transformation is not used in histogram equalization, it plays a central role in the histogram-matching scheme developed in the next section.
EXAMPLE 3.6: Histogram equalization. ■ The left column in Fig. 3.20 shows the four images from Fig. 3.16, and the center column shows the result of performing histogram equalization on each of these images. The first three results from top to bottom show significant improvement. As expected, histogram equalization did not have much effect on the fourth image because the intensities of this image already span the full intensity scale. Figure 3.21 shows the transformation functions used to generate the equalized images in Fig. 3.20. These functions were generated using Eq. (3.3-8). Observe that transformation (4) has a nearly linear shape, indicating that the inputs were mapped to nearly equal outputs.

The third column in Fig. 3.20 shows the histograms of the equalized images. It is of interest to note that, while all these histograms are different, the histogram-equalized images themselves are visually very similar. This is not unexpected because the basic difference between the images on the left column is one of contrast, not content. In other words, because the images have the same content, the increase in contrast resulting from histogram equalization was enough to render any intensity differences in the equalized images visually indistinguishable. Given the significant contrast differences between the original images, this example illustrates the power of histogram equalization as an adaptive contrast enhancement tool. ■
FIGURE 3.20 Left column: images from Fig. 3.16. Center column: corresponding histogram-
equalized images. Right column: histograms of the images in the center column.
The transformation T(r) can be obtained from Eq. (3.3-10) once p_r(r) has been estimated from the input image. Similarly, the transformation function G(z) can be obtained using Eq. (3.3-11) because p_z(z) is given.

Equations (3.3-10) through (3.3-12) show that an image whose intensity levels have a specified probability density function can be obtained from a given image by using the following procedure:

1. Obtain p_r(r) from the input image and use Eq. (3.3-10) to obtain the values of s.
2. Use the specified PDF in Eq. (3.3-11) to obtain the transformation function G(z).
3. Obtain the inverse transformation z = G⁻¹(s); because z is obtained from s, this process is a mapping from s to z, the latter being the values that have the specified PDF.
4. Obtain the output image by first equalizing the input image; the pixel values in this image are the s values. For each pixel with value s in the equalized image, perform the inverse mapping z = G⁻¹(s) to obtain the corresponding pixel in the output image.
EXAMPLE 3.7: Histogram specification. ■ Assuming continuous intensity values, suppose that an image has the intensity PDF p_r(r) = 2r/(L-1)² for 0 ≤ r ≤ (L - 1) and p_r(r) = 0 for other values of r. Find the transformation function that will produce an image whose intensity PDF is p_z(z) = 3z²/(L-1)³ for 0 ≤ z ≤ (L - 1) and p_z(z) = 0 for other values of z.

First, we find the histogram equalization transformation for the interval [0, L - 1]:

$$ s = T(r) = (L-1)\int_0^{r} p_r(w)\,dw = \frac{2}{L-1}\int_0^{r} w\,dw = \frac{r^2}{L-1} $$

By definition, this transformation is 0 for values outside the range [0, L - 1]. Squaring the values of the input intensities and dividing them by (L - 1) will produce an image whose intensities, s, have a uniform PDF because this is a histogram-equalization transformation, as discussed earlier.

We are interested in an image with a specified histogram, so we find next

$$ G(z) = (L-1)\int_0^{z} p_z(w)\,dw = \frac{3}{(L-1)^2}\int_0^{z} w^2\,dw = \frac{z^3}{(L-1)^2} $$

over the interval [0, L - 1]; this function is 0 elsewhere by definition. Finally, we require that G(z) = s, but G(z) = z³/(L-1)²; so z³/(L-1)² = s, and we have

$$ z = \left[(L-1)^2\, s\right]^{1/3} $$

So, if we multiply every histogram-equalized pixel by (L - 1)² and raise the product to the power 1/3, the result will be an image whose intensities, z, have the PDF p_z(z) = 3z²/(L-1)³ in the interval [0, L - 1], as desired.

Because s = r²/(L - 1) we can generate the z's directly from the intensities, r, of the input image:

$$ z = \left[(L-1)^2\, s\right]^{1/3} = \left[(L-1)^2\,\frac{r^2}{L-1}\right]^{1/3} = \left[(L-1)\,r^2\right]^{1/3} $$

Thus, squaring the value of each pixel in the original image, multiplying the result by (L - 1), and raising the product to the power 1/3 will yield an image whose intensity levels, z, have the specified PDF. We see that the intermediate step of equalizing the input image can be skipped; all we need is to obtain the transformation function T(r) that maps r to s. Then, the two steps can be combined into a single transformation from r to z. ■
As the preceding example shows, histogram specification is straightforward in principle. In practice, a common difficulty is finding meaningful analytical expressions for T(r) and G⁻¹. Fortunately, the problem is simplified significantly when dealing with discrete quantities. The price paid is the same as for histogram equalization, where only an approximation to the desired histogram is achievable. In spite of this, however, some very useful results can be obtained, even with crude approximations.

The discrete formulation of Eq. (3.3-10) is the histogram equalization transformation in Eq. (3.3-8), which we repeat here for convenience:

$$ s_k = T(r_k) = (L-1)\sum_{j=0}^{k} p_r(r_j) = \frac{L-1}{MN}\sum_{j=0}^{k} n_j \qquad k = 0, 1, 2, \ldots, L-1 \tag{3.3-13} $$
where, as before, MN is the total number of pixels in the image, n_j is the number of pixels that have intensity value r_j, and L is the total number of possible intensity levels in the image. Similarly, given a specific value of s_k, the discrete formulation of Eq. (3.3-11) involves computing the transformation function

$$ G(z_q) = (L-1)\sum_{i=0}^{q} p_z(z_i) \tag{3.3-14} $$

for a value of z_q, so that

$$ G(z_q) = s_k \tag{3.3-15} $$

where p_z(z_i) is the ith value of the specified histogram. As before, we find the desired value z_q by obtaining the inverse transformation:

$$ z_q = G^{-1}(s_k) \tag{3.3-16} $$

In other words, this operation gives a value of z for each value of s; thus, it performs a mapping from s to z.
In practice, we do not need to compute the inverse of G. Because we deal with intensity levels that are integers (e.g., 0 to 255 for an 8-bit image), it is a simple matter to compute all the possible values of G using Eq. (3.3-14) for q = 0, 1, 2, …, L - 1. These values are scaled and rounded to their nearest integer values spanning the range [0, L - 1]. The values are stored in a table. Then, given a particular value of s_k, we look for the closest match in the values stored in the table. If, for example, the 64th entry in the table is the closest to s_k, then q = 63 (recall that we start counting at 0) and z_63 is the best solution to Eq. (3.3-15). Thus, the given value s_k would be associated with z_63 (i.e., that specific value of s_k would map to z_63). Because the z's are intensities used as the basis for specifying the histogram p_z(z), it follows that z_0 = 0, z_1 = 1, …, z_{L-1} = L - 1, so z_63 would have the intensity value 63. By repeating this procedure, we would find the mapping of each value of s_k to the value of z_q that is the closest solution to Eq. (3.3-15). These mappings are the solution to the histogram-specification problem.
Recalling that the s_k's are the values of the histogram-equalized image, we may summarize the histogram-specification procedure as follows:

1. Compute the histogram p_r(r) of the given image, and use it to find the histogram equalization transformation in Eq. (3.3-13). Round the resulting values, s_k, to the integer range [0, L - 1].
2. Compute all values of the transformation function G using Eq. (3.3-14) for q = 0, 1, 2, …, L - 1, where p_z(z_i) are the values of the specified histogram. Round the values of G to integers in the range [0, L - 1]. Store the values of G in a table.
3. For every value of s_k, k = 0, 1, 2, …, L - 1, use the stored values of G from step 2 to find the corresponding value of z_q so that G(z_q) is closest to s_k, and store these mappings from s to z. When more than one value of z_q satisfies the given s_k (i.e., the mapping is not unique), choose the smallest value by convention.
4. Form the histogram-specified image by first histogram-equalizing the input image and then mapping every equalized pixel value, s_k, of this image to the corresponding value z_q in the histogram-specified image using the mappings found in step 3. As in the continuous case, the intermediate step of equalizing the input image is conceptual. It can be skipped by combining the two transformation functions, T and G⁻¹, as Example 3.8 shows.
As mentioned earlier, for G⁻¹ to satisfy conditions (a′) and (b), G has to be strictly monotonic, which, according to Eq. (3.3-14), means that none of the values p_z(z_i) of the specified histogram can be zero (Problem 3.10). When working with discrete quantities, the fact that this condition may not be satisfied is not a serious implementation issue, as step 3 above indicates. The following example illustrates this numerically.
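A minimal sketch of steps 1 through 4 in NumPy follows (function and variable names are illustrative). np.argmin returns the smallest index on ties, which matches the convention in step 3. The short demo at the end rebuilds the hypothetical image of Example 3.5 and uses the specified histogram from Table 3.2; it reproduces the mappings of Table 3.4:

```python
import numpy as np

def match_histogram(image, target_p, L=256):
    # Step 1: histogram-equalization transformation of the input (Eq. 3.3-13).
    hist = np.bincount(image.ravel(), minlength=L)
    s = np.round((L - 1) * np.cumsum(hist / hist.sum())).astype(int)

    # Step 2: transformation G from the specified histogram (Eq. 3.3-14),
    # rounded to integers in [0, L-1].
    G = np.round((L - 1) * np.cumsum(target_p)).astype(int)

    # Step 3: for each s_k, the smallest z_q with G(z_q) closest to s_k.
    z = np.array([np.argmin(np.abs(G - s[k])) for k in range(L)])

    # Step 4: combined mapping r -> s -> z applied to every pixel.
    return z[s[image]]

counts = [790, 1023, 850, 656, 329, 245, 122, 81]          # Table 3.1
img = np.repeat(np.arange(8), counts).reshape(64, 64)      # Example 3.5 image
target = np.array([0, 0, 0, 0.15, 0.20, 0.30, 0.20, 0.15]) # Table 3.2, specified
out = match_histogram(img, target, L=8)
print(np.bincount(out.ravel(), minlength=8) / out.size)    # actual histogram, Table 3.2
```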
EXAMPLE 3.8: A simple example of histogram specification. ■ Consider again the 64 × 64 hypothetical image from Example 3.5, whose histogram is repeated in Fig. 3.22(a). It is desired to transform this histogram so that it will have the values specified in the second column of Table 3.2. Figure 3.22(b) shows a sketch of this histogram.

The first step in the procedure is to obtain the scaled histogram-equalized values, which we did in Example 3.5:

    s_0 = 1    s_2 = 5    s_4 = 7    s_6 = 7
    s_1 = 3    s_3 = 6    s_5 = 7    s_7 = 7
FIGURE 3.22 (a) Histogram of a 3-bit image. (b) Specified histogram. (c) Transformation function obtained from the specified histogram. (d) Result of performing histogram specification. Compare (b) and (d).
In the next step, we compute all the values of the transformation function, G, using Eq. (3.3-14):

$$ G(z_0) = 7\sum_{j=0}^{0} p_z(z_j) = 0.00 $$

Similarly,

$$ G(z_1) = 7\sum_{j=0}^{1} p_z(z_j) = 7\,p_z(z_0) + 7\,p_z(z_1) = 0.00 $$

and

    G(z_2) = 0.00    G(z_3) = 1.05    G(z_4) = 2.45
    G(z_5) = 4.55    G(z_6) = 5.95    G(z_7) = 7.00
TABLE 3.2 Specified and actual histograms (the values in the third column are from the computations performed in the body of Example 3.8).

    z_q        Specified p_z(z_q)    Actual p_z(z_q)
    z_0 = 0    0.00                  0.00
    z_1 = 1    0.00                  0.00
    z_2 = 2    0.00                  0.00
    z_3 = 3    0.15                  0.19
    z_4 = 4    0.20                  0.25
    z_5 = 5    0.30                  0.21
    z_6 = 6    0.20                  0.24
    z_7 = 7    0.15                  0.11
These results are summarized in Table 3.3, and the transformation function is sketched in Fig. 3.22(c). Observe that G is not strictly monotonic, so condition (a′) is violated. Therefore, we make use of the approach outlined in step 3 of the algorithm to handle this situation.

In the third step of the procedure, we find the smallest value of z_q so that the value G(z_q) is the closest to s_k. We do this for every value of s_k to create the required mappings from s to z. For example, s_0 = 1, and we see that G(z_3) = 1, which is a perfect match in this case, so we have the correspondence s_0 → z_3. That is, every pixel whose value is 1 in the histogram-equalized image would map to a pixel valued 3 (in the corresponding location) in the histogram-specified image. Continuing in this manner, we arrive at the mappings in Table 3.4.

In the final step of the procedure, we use the mappings in Table 3.4 to map every pixel in the histogram-equalized image into a corresponding pixel in the newly created histogram-specified image. The values of the resulting histogram are listed in the third column of Table 3.2, and the histogram is sketched in Fig. 3.22(d). The values of p_z(z_q) were obtained using the same procedure as in Example 3.5. For instance, we see in Table 3.4 that s = 1 maps to z = 3, and there are 790 pixels in the histogram-equalized image with a value of 1. Therefore, p_z(z_3) = 790/4096 = 0.19.

Although the final result shown in Fig. 3.22(d) does not match the specified histogram exactly, the general trend of moving the intensities toward the high end of the intensity scale definitely was achieved. As mentioned earlier, obtaining the histogram-equalized image as an intermediate step is useful for explaining the procedure, but this is not necessary. Instead, we could list the mappings from the r's to the s's and from the s's to the z's in a three-column table. Then, we would use those mappings to map the original pixels directly into the pixels of the histogram-specified image. ■
TABLE 3.3 All possible values of the transformation function G scaled, rounded, and ordered with respect to z.

    z_q        G(z_q)
    z_0 = 0    0
    z_1 = 1    0
    z_2 = 2    0
    z_3 = 3    1
    z_4 = 4    2
    z_5 = 5    5
    z_6 = 6    6
    z_7 = 7    7
TABLE 3.4 Mappings of all the values of s_k into corresponding values of z_q.

    s_k → z_q
    1 → 3
    3 → 4
    5 → 5
    6 → 6
    7 → 7
EXAMPLE 3.9: Comparison between histogram equalization and histogram matching. ■ Figure 3.23(a) shows an image of the Mars moon, Phobos, taken by NASA's Mars Global Surveyor. Figure 3.23(b) shows the histogram of Fig. 3.23(a). The image is dominated by large, dark areas, resulting in a histogram characterized by a large concentration of pixels in the dark end of the gray scale. At first glance, one might conclude that histogram equalization would be a good approach to enhance this image, so that details in the dark areas become more visible. It is demonstrated in the following discussion that this is not so.
visible. It is demonstrated in the following discussion that this is not so.
Figure 3.24(a) shows the histogram equalization transformation [Eq. (3.3-8)
or (3.3-13)] obtained from the histogram in Fig. 3.23(b). The most relevant
characteristic of this transformation function is how fast it rises from intensity
level 0 to a level near 190. This is caused by the large concentration of pixels in
the input histogram having levels near 0. When this transformation is applied
to the levels of the input image to obtain a histogram-equalized result, the net
effect is to map a very narrow interval of dark pixels into the upper end of the
gray scale of the output image. Because numerous pixels in the input image
have levels precisely in this interval, we would expect the result to be an image
with a light, washed-out appearance. As Fig. 3.24(b) shows, this is indeed the
a b
FIGURE 3.23
(a) Image of the
Mars moon
7.00
Phobos taken by
Number of pixels ( 10 4)
NASA’s Mars
Global Surveyor. 5.25
(b) Histogram.
(Original image
3.50
courtesy of
NASA.)
1.75
0
0 64 128 192 255
Intensity
FIGURE 3.24 (a) Transformation function for histogram equalization. (b) Histogram-equalized image (note the washed-out appearance). (c) Histogram of (b).
Because the problem with the transformation function in Fig. 3.24(a) was caused by a large concentration of pixels in the original image with levels near 0, a reasonable approach is to modify the histogram of that image so that it does not have this property. Figure 3.25(a) shows a manually specified function that preserves the general shape of the original histogram, but has a smoother transition of levels in the dark region of the gray scale. Sampling this function into 256 equally spaced discrete values produced the desired specified histogram. The transformation function G(z) obtained from this histogram using Eq. (3.3-14) is labeled transformation (1) in Fig. 3.25(b). Similarly, the inverse transformation G⁻¹(s) from Eq. (3.3-16) (obtained using the step-by-step procedure discussed earlier) is labeled transformation (2) in Fig. 3.25(b). The enhanced image in Fig. 3.25(c) was obtained by applying transformation (2) to the pixels of the histogram-equalized image in Fig. 3.24(b). The improvement of the histogram-specified image over the result obtained by histogram equalization is evident by comparing these two images. It is of interest to note that a rather modest change in the original histogram was all that was required to obtain a significant improvement in appearance. Figure 3.25(d) shows the histogram of Fig. 3.25(c). The most distinguishing feature of this histogram is how its low end has shifted right toward the lighter region of the gray scale (but not excessively so), as desired. ■
FIGURE 3.25 (a) Specified histogram. (b) Transformations: curves (1) and (2) as discussed above. (c) Enhanced image using mappings from curve (2). (d) Histogram of (c).
EXAMPLE 3.10: Local histogram equalization. ■ Figure 3.26(a) shows an 8-bit, 512 × 512 image that at first glance appears to contain five black squares on a gray background. The image is slightly noisy, but the noise is imperceptible. Figure 3.26(b) shows the result of global histogram equalization. As often is the case with histogram equalization of smooth, noisy regions, this image shows significant enhancement of the noise. Aside from the noise, however, Fig. 3.26(b) does not reveal any new significant details from the original, other than a very faint hint that the top left and bottom right squares contain an object. Figure 3.26(c) was obtained using local histogram equalization with a neighborhood of size 3 × 3. Here, we see significant detail contained within the dark squares. The intensity values of these objects were too close to the intensity of the large squares, and their sizes were too small, to influence global histogram equalization significantly enough to show this detail. ■

FIGURE 3.26 (a) Original image. (b) Result of global histogram equalization. (c) Result of local histogram equalization applied to (a), using a neighborhood of size 3 × 3.
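A direct, unoptimized sketch of the local method used for Fig. 3.26(c): each pixel is replaced by its equalized value computed from the histogram of its own neighborhood. Replicate padding at the borders is an assumption; the text does not specify how borders were handled:

```python
import numpy as np

def local_hist_equalize(image, size=3, L=256):
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')   # replicate-pad the borders
    out = np.empty_like(image)
    M, N = image.shape
    for x in range(M):
        for y in range(N):
            region = padded[x:x + size, y:y + size]
            hist = np.bincount(region.ravel(), minlength=L)
            cdf = np.cumsum(hist) / region.size
            # Equalized value of the center pixel from the local CDF.
            out[x, y] = np.round((L - 1) * cdf[image[x, y]])
    return out
```

A practical implementation would update the local histogram incrementally as the window slides, rather than recomputing it from scratch at every pixel.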
The nth moment of r about its mean is defined as

$$ \mu_n(r) = \sum_{i=0}^{L-1} (r_i - m)^n\, p(r_i) \tag{3.3-17} $$

where m is the mean (average intensity) value of r (i.e., the average intensity of the pixels in the image):

$$ m = \sum_{i=0}^{L-1} r_i\, p(r_i) \tag{3.3-18} $$

(We follow convention in using m for the mean value; do not confuse it with the same symbol used to denote the number of rows in an m × n neighborhood, in which we also follow notational convention.) The second moment is particularly important:

$$ \mu_2(r) = \sum_{i=0}^{L-1} (r_i - m)^2\, p(r_i) \tag{3.3-19} $$
We recognize Eq. (3.3-19) as the intensity variance, normally denoted σ². The mean and variance can also be computed directly from the pixel values, without reference to the histogram:

$$ m = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x, y) \tag{3.3-20} $$

$$ \sigma^2 = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} \left[f(x, y) - m\right]^2 \tag{3.3-21} $$

In other words, as we know, the mean intensity of an image can be obtained simply by summing the values of all its pixels and dividing the sum by the total number of pixels in the image. A similar interpretation applies to Eq. (3.3-21). As we illustrate in the following example, the results obtained using these two equations are identical to the results obtained using Eqs. (3.3-18) and (3.3-19), provided that the histogram used in these equations is computed from the same image used in Eqs. (3.3-20) and (3.3-21). (The denominator of Eq. (3.3-21) is sometimes written as MN - 1 instead of MN; this is done to obtain a so-called unbiased estimate of the variance. However, we are more interested in Eqs. (3.3-21) and (3.3-19) agreeing when the histogram in the latter equation is computed from the same image used in Eq. (3.3-21). For this we require the MN term. The difference is negligible for any image of practical size.)
EXAMPLE 3.11: Computing histogram statistics. ■ Before proceeding, it will be useful to work through a simple numerical example to fix ideas. Consider the following 2-bit image of size 5 × 5:

    0 0 1 1 2
    1 2 3 0 1
    3 3 2 2 0
    2 3 1 0 0
    1 1 3 2 2

The pixels are represented by 2 bits; therefore, L = 4 and the intensity levels are in the range [0, 3]. The total number of pixels is 25, so the histogram has the components

    p(r_0) = 6/25 = 0.24        p(r_1) = 7/25 = 0.28
    p(r_2) = 7/25 = 0.28        p(r_3) = 5/25 = 0.20

where the numerator in p(r_i) is the number of pixels in the image with intensity level r_i. We can compute the average value of the intensities in the image using Eq. (3.3-18):

$$ m = \sum_{i=0}^{3} r_i\, p(r_i) = (0)(0.24) + (1)(0.28) + (2)(0.28) + (3)(0.20) = 1.44 $$
As expected, the results agree. Similarly, the result for the variance is the same
(1.1264) using either Eq. (3.3-19) or (3.3-21). ■
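Both routes to the statistics of this example can be checked in a few lines:

```python
import numpy as np

img = np.array([[0, 0, 1, 1, 2],
                [1, 2, 3, 0, 1],
                [3, 3, 2, 2, 0],
                [2, 3, 1, 0, 0],
                [1, 1, 3, 2, 2]])

# Histogram route: Eqs. (3.3-18) and (3.3-19).
L = 4
p = np.bincount(img.ravel(), minlength=L) / img.size   # [0.24 0.28 0.28 0.20]
r = np.arange(L)
m = np.sum(r * p)                                      # 1.44
var = np.sum((r - m) ** 2 * p)                         # 1.1264

# Direct route: Eqs. (3.3-20) and (3.3-21) -- np.var uses the MN denominator.
assert np.isclose(m, img.mean()) and np.isclose(var, img.var())
print(m, var)
```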
We consider two uses of the mean and variance for enhancement purposes. The global mean and variance are computed over an entire image and are useful for gross adjustments in overall intensity and contrast. A more powerful use of these parameters is in local enhancement, where the local mean and variance are used as the basis for making changes that depend on image characteristics in a neighborhood about each pixel in an image.
Let (x, y) denote the coordinates of any pixel in a given image, and let S_xy denote a neighborhood (subimage) of specified size, centered on (x, y). The mean value of the pixels in this neighborhood is given by the expression

$$ m_{S_{xy}} = \sum_{i=0}^{L-1} r_i\, p_{S_{xy}}(r_i) \tag{3.3-22} $$

where p_{S_xy} is the histogram of the pixels in region S_xy. This histogram has L components, corresponding to the L possible intensity values in the input image. However, many of the components are 0, depending on the size of S_xy. For example, if the neighborhood is of size 3 × 3 and L = 256, only between 1 and 9 of the 256 components of the histogram of the neighborhood will be nonzero. These nonzero values will correspond to the number of different intensities in S_xy (the maximum number of possible different intensities in a 3 × 3 region is 9, and the minimum is 1).

The variance of the pixels in the neighborhood similarly is given by

$$ \sigma^2_{S_{xy}} = \sum_{i=0}^{L-1} \left(r_i - m_{S_{xy}}\right)^2 p_{S_{xy}}(r_i) \tag{3.3-23} $$
EXAMPLE 3.12: Local enhancement using histogram statistics. ■ Figure 3.27(a) shows an SEM (scanning electron microscope) image of a tungsten filament wrapped around a support. The filament in the center of the image and its support are quite clear and easy to study. There is another filament structure on the right, dark side of the image, but it is almost imperceptible, and its size and other characteristics certainly are not easily discernable. Local enhancement by contrast manipulation is an ideal approach to problems such as this, in which parts of an image may contain hidden features.

FIGURE 3.27 (a) SEM image of a tungsten filament magnified approximately 130×. (b) Result of global histogram equalization. (c) Image enhanced using local histogram statistics. (Original image courtesy of Mr. Michael Shaffer, Department of Geological Sciences, University of Oregon, Eugene.)
In this particular case, the problem is to enhance dark areas while leaving the light area as unchanged as possible because it does not require enhancement. We can use the concepts presented in this section to formulate an enhancement method that can tell the difference between dark and light and, at the same time, is capable of enhancing only the dark areas. A measure of whether an area is relatively light or dark at a point (x, y) is to compare the average local intensity, m_Sxy, to the average image intensity, called the global mean and denoted m_G. This quantity is obtained with Eq. (3.3-18) or (3.3-20) using the entire image. Thus, we have the first element of our enhancement scheme: We will consider the pixel at a point (x, y) as a candidate for processing if m_Sxy ≤ k₀m_G, where k₀ is a positive constant with value less than 1.0.

Because we are interested in enhancing areas that have low contrast, we also need a measure to determine whether the contrast of an area makes it a candidate for enhancement. We consider the pixel at a point (x, y) as a candidate for enhancement if σ_Sxy ≤ k₂σ_G, where σ_G is the global standard deviation obtained using Eq. (3.3-19) or (3.3-21) and k₂ is a positive constant. The value of this constant will be greater than 1.0 if we are interested in enhancing light areas and less than 1.0 for dark areas.

Finally, we need to restrict the lowest values of contrast we are willing to accept; otherwise the procedure would attempt to enhance constant areas, whose standard deviation is zero. Thus, we also set a lower limit on the local standard deviation by requiring that k₁σ_G ≤ σ_Sxy, with k₁ < k₂. A pixel at (x, y) that meets all the conditions for local enhancement is processed simply by multiplying it by a specified constant, E, to increase (or decrease) the value of its intensity level relative to the rest of the image. Pixels that do not meet the enhancement conditions are not changed.
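A minimal sketch of this enhancement rule follows. The constants E, k0, k1, and k2 shown are illustrative placeholders consistent with the discussion (k0 and k2 below 1.0 to select dark, low-contrast areas, and k1 < k2); the excerpt does not give the values used for Fig. 3.27(c):

```python
import numpy as np

def local_stats_enhance(image, E=4.0, k0=0.4, k1=0.02, k2=0.4, size=3):
    mG, sG = image.mean(), image.std()          # global mean and std. deviation
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode='edge')
    out = image.astype(np.float64).copy()
    M, N = image.shape
    for x in range(M):
        for y in range(N):
            region = padded[x:x + size, y:y + size]
            mS, sS = region.mean(), region.std()
            # Candidate: dark (m_Sxy <= k0*mG) and low but nonzero contrast.
            if mS <= k0 * mG and k1 * sG <= sS <= k2 * sG:
                out[x, y] *= E                   # multiply qualifying pixels by E
    return np.clip(out, 0, 255).astype(np.uint8)
```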
Observe that the center coefficient of the filter, w(0, 0), aligns with the pixel at location (x, y). For a mask of size m × n, we assume that m = 2a + 1 and n = 2b + 1, where a and b are positive integers. This means that our focus in the following discussion is on filters of odd size, with the smallest being of size 3 × 3. (It certainly is possible to work with filters of even size or mixed even and odd sizes; however, working with odd sizes simplifies indexing and also is more intuitive because the filters have centers falling on integer values.) In general, linear spatial filtering of an image of size M × N with a filter of size m × n is given by the expression

$$ g(x, y) = \sum_{s=-a}^{a}\sum_{t=-b}^{b} w(s, t)\, f(x + s, y + t) $$

where x and y are varied so that each pixel in w visits every pixel in f.†

† The filtered pixel value typically is assigned to a corresponding location in a new image created to hold the results of filtering. It is seldom the case that filtered pixels replace the values of the corresponding location in the original image, as this would change the content of the image while filtering still is being performed.
FIGURE 3.28 The mechanics of linear spatial filtering using a 3 × 3 filter mask. The form chosen to denote the coordinates of the filter mask coefficients simplifies writing expressions for linear filtering.
FIGURE 3.29 Illustration of 1-D correlation and convolution of a filter w = (1, 2, 3, 2, 8) with a discrete unit impulse f = (0, 0, 0, 1, 0, 0, 0, 0), showing the starting position alignment, zero padding, the positions after one and four shifts, and the final position for both operations. Note that correlation and convolution are functions of displacement.
In the starting alignment of Fig. 3.29(b), there are parts of the functions that do not overlap. The solution to this problem is to pad f with enough 0s on either side to allow each pixel in w to visit every pixel in f. If the filter is of size m, we need m - 1 0s on either side of f. Figure 3.29(c) shows a properly padded function. (Zero padding is not the only option; for example, we could duplicate the value of the first and last element m - 1 times on each side of f, or mirror the first and last m - 1 elements and use the mirrored values for padding.) The first value of correlation is the sum of products of f and w for the initial position shown in Fig. 3.29(c) (the sum of products is 0). This corresponds to a displacement x = 0. To obtain the second value of correlation, we shift w one pixel location to the right (a displacement of x = 1) and compute the sum of products. The result again is 0. In fact, the first nonzero result is when x = 3, in which case the 8 in w overlaps the 1 in f and the result of correlation is 8. Proceeding in this manner, we obtain the full correlation result in Fig. 3.29(g). Note that it took 12 values of x (i.e., x = 0, 1, 2, …, 11) to fully slide w past f so that each pixel in w visited every pixel in f. Often, we like to work with correlation arrays that are the same size as f, in which case we crop the full correlation to the size of the original function, as Fig. 3.29(h) shows.
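The numbers in this discussion are easy to reproduce. The sketch below uses the filter w = (1, 2, 3, 2, 8) and unit impulse f shown in Fig. 3.29; the full correlation has 12 values, the first nonzero one at x = 3, and cropping drops (m - 1)/2 values from each end:

```python
import numpy as np

w = np.array([1, 2, 3, 2, 8])
f = np.array([0, 0, 0, 1, 0, 0, 0, 0])   # discrete unit impulse
m = len(w)

fp = np.pad(f, m - 1)                    # m - 1 zeros on each side of f
n_shifts = len(f) + m - 1                # 12 displacements in total
corr = np.array([np.sum(w * fp[x:x + m]) for x in range(n_shifts)])
conv = np.array([np.sum(w[::-1] * fp[x:x + m]) for x in range(n_shifts)])

print(corr)   # [0 0 0 8 2 3 2 1 0 0 0 0] -- w rotated 180 deg at the impulse
print(conv)   # [0 0 0 1 2 3 2 8 0 0 0 0] -- a copy of w at the impulse
start = (m - 1) // 2
print(corr[start:start + len(f)])        # cropped to the size of f
```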
There are two important points to note from the discussion in the preceding paragraph. First, correlation is a function of displacement of the filter. In other words, the first value of correlation corresponds to zero displacement of the filter, the second corresponds to one unit displacement, and so on. The second thing to notice is that correlating a filter w with a function that contains all 0s and a single 1 yields a result that is a copy of w, but rotated by 180°. We call a function that contains a single 1 with the rest being 0s a discrete unit impulse. So we conclude that correlation of a function with a discrete unit impulse yields a rotated version of the function at the location of the impulse.

The concept of convolution is a cornerstone of linear system theory. As you will learn in Chapter 4, a fundamental property of convolution is that convolving a function with a unit impulse yields a copy of the function at the location of the impulse. We saw in the previous paragraph that correlation yields a copy of the function also, but rotated by 180°. Therefore, if we pre-rotate the filter and perform the same sliding sum of products operation, we should be able to obtain the desired result. (Note that rotation by 180° is equivalent to flipping the function horizontally.) As the right column in Fig. 3.29 shows, this indeed is the case. Thus, we see that to perform convolution all we do is rotate one function by 180° and perform the same operations as in correlation. As it turns out, it makes no difference which of the two functions we rotate.
The preceding concepts extend easily to images, as Fig. 3.30 shows. For a filter of size m × n, we pad the image with a minimum of m - 1 rows of 0s at the top and bottom and n - 1 columns of 0s on the left and right. In this case, m and n are equal to 3, so we pad f with two rows of 0s above and below and two columns of 0s to the left and right, as Fig. 3.30(b) shows. Figure 3.30(c) shows the initial position of the filter mask for performing correlation, and Fig. 3.30(d) shows the full correlation result. Figure 3.30(e) shows the corresponding cropped result. Note again that the result is rotated by 180°. For convolution, we pre-rotate the mask as before and repeat the sliding sum of products just explained. (In 2-D, rotation by 180° is equivalent to flipping the mask along one axis and then the other.) Figures 3.30(f) through (h) show the result. You see again that convolution of a function with an impulse copies the function at the location of the impulse. It should be clear that, if the filter mask is symmetric, correlation and convolution yield the same result.
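The 2-D case can be sketched the same way. The snippet below (Python with SciPy; an illustration, not the book's code, and the mask values are arbitrary) correlates and convolves a 3 × 3 mask with an image containing a single impulse:

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d

w = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])               # an arbitrary 3 x 3 mask

f = np.zeros((5, 5)); f[2, 2] = 1       # 2-D discrete unit impulse

# Correlation copies w rotated by 180 degrees at the location of the impulse.
print(correlate2d(f, w, mode='same', boundary='fill'))

# Convolution copies w itself; the pre-rotation cancels the 180-degree flip.
print(convolve2d(f, w, mode='same', boundary='fill'))
```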
If, instead of containing a single 1, image f in Fig. 3.30 had contained a re-
gion identically equal to w, the value of the correlation function (after nor-
malization) would have been maximum when w was centered on that region
of f. Thus, as you will see in Chapter 12, correlation can be used also to find
matches between images.
Summarizing the preceding discussion in equation form, we have that the correlation of a filter w(x, y) of size m × n with an image f(x, y), denoted as w(x, y) ⋆ f(x, y), is given by the equation listed at the end of the last section, which we repeat here for convenience:

$$w(x, y) \star f(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s,\, y + t) \tag{3.4-1}$$

This equation is evaluated for all values of the displacement variables x and y so that all elements of w visit every pixel in f, where we assume that f has been padded appropriately. As explained earlier, a = (m - 1)/2, b = (n - 1)/2, and we assume for notational convenience that m and n are odd integers.
In a similar manner, the convolution of w(x, y) and f(x, y), denoted by w(x, y) * f(x, y),† is given by the expression

$$w(x, y) * f(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x - s,\, y - t) \tag{3.4-2}$$

where the minus signs on the right flip f (i.e., rotate it by 180°). Flipping and shifting f instead of w is done for notational simplicity and also to follow convention. The result is the same. As with correlation, this equation is evaluated for all values of the displacement variables x and y so that every element of w visits every pixel in f, which we assume has been padded appropriately. You should expand Eq. (3.4-2) for a 3 × 3 mask and convince yourself that the result using this equation is identical to the example in Fig. 3.30. (Often, when the meaning is clear, we denote the result of correlation or convolution by a function g(x, y), instead of writing w(x, y) ⋆ f(x, y) or w(x, y) * f(x, y); see, for example, the equation at the end of the previous section, and Eq. (3.5-1).) In practice, we frequently work with an algorithm that implements the response of the mask at each point (x, y) directly as a sum of products:

† Because convolution is commutative, we have that w(x, y) * f(x, y) = f(x, y) * w(x, y). Correlation is not commutative in general; interchanging the two functions rotates the result by 180°.
$$R = w_1 z_1 + w_2 z_2 + \cdots + w_{mn} z_{mn} = \sum_{k=1}^{mn} w_k z_k = \mathbf{w}^T \mathbf{z} \tag{3.4-3}$$

(Consult the Tutorials section of the book Web site for a brief review of vectors and matrices.)
where the ws are the coefficients of an m * n filter and the zs are the corre-
sponding image intensities encompassed by the filter. If we are interested in
using Eq. (3.4-3) for correlation, we use the mask as given. To use the same
equation for convolution, we simply rotate the mask by 180°, as explained in
the last section. It is implied that Eq. (3.4-3) holds for a particular pair of coor-
dinates (x, y). You will see in the next section why this notation is convenient
for explaining the characteristics of a given linear filter.
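A direct, loop-based sketch of this computation (Python/NumPy; illustrative rather than optimized, assuming odd m and n and zero padding) makes the sum-of-products mechanics explicit:

```python
import numpy as np

def spatial_filter(f, w):
    """Correlate an odd-size m x n mask w with image f using zero padding.

    For convolution, pass np.rot90(w, 2) (w rotated by 180 degrees) instead.
    """
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    fp = np.pad(f, ((a, a), (b, b)), mode='constant')   # zero padding
    g = np.zeros(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            # Response R = sum_k w_k z_k over the neighborhood (Eq. 3.4-3).
            g[x, y] = np.sum(w * fp[x:x + m, y:y + n])
    return g
```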
FIGURE 3.31 Another representation of a general 3 × 3 filter mask, with coefficients denoted w1 through w9.
But this is the same as Eq. (3.4-4) with coefficient values wi = 1/9. In other words, a linear filtering operation with a 3 × 3 mask whose coefficients are 1/9 implements the desired averaging. As we discuss in the next section, this operation results in image smoothing. We discuss in the following sections a number of other filter masks based on this basic approach.

In some applications, we have a continuous function of two variables, and the objective is to obtain a spatial filter mask based on that function. For example, a Gaussian function of two variables has the basic form

$$h(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
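Sampling this continuous function on an integer grid yields mask coefficients; the sketch below (Python/NumPy, an illustration of the idea — the normalization to unit sum is a common convention, not something established at this point in the text) generates a small Gaussian mask:

```python
import numpy as np

def gaussian_mask(size=3, sigma=1.0):
    """Sample h(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)) on an odd-size grid."""
    a = (size - 1) // 2
    y, x = np.mgrid[-a:a + 1, -a:a + 1]
    h = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return h / h.sum()      # normalize so the coefficients sum to 1

print(gaussian_mask(3, 1.0).round(3))
```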
FIGURE 3.32 Two 3 × 3 smoothing (averaging) filter masks: (a) the box filter, (1/9) × [1 1 1; 1 1 1; 1 1 1], and (b) the weighted-average filter, (1/16) × [1 2 1; 2 4 2; 1 2 1]. The constant multiplier in front of each mask is equal to 1 divided by the sum of the values of its coefficients, as is required to compute an average.
The response of the first mask in Fig. 3.32 at any point is the sum of the nine intensity levels under the mask divided by 9, which is the average of the intensity levels of the pixels in the 3 × 3 neighborhood defined by the mask, as discussed earlier. Note that, instead of being 1/9, the coefficients of the filter are all 1s. The idea here is that it is computationally more efficient to have coefficients valued 1. At the end of the filtering process the entire image is divided by 9. An m × n mask would have a normalizing constant equal to 1/mn. A spatial averaging filter in which all coefficients are equal sometimes is called a box filter.
The second mask in Fig. 3.32 is a little more interesting. This mask yields a so-called weighted average, terminology used to indicate that pixels are multiplied by different coefficients, thus giving more importance (weight) to some pixels at the expense of others. In the mask shown in Fig. 3.32(b) the pixel at the center of the mask is multiplied by a higher value than any other, thus giving this pixel more importance in the calculation of the average. The other pixels are inversely weighted as a function of their distance from the center of the mask. The diagonal terms are further away from the center than the orthogonal neighbors (by a factor of √2) and, thus, are weighted less than the immediate neighbors of the center pixel. The basic strategy behind weighting the center point the highest and then reducing the value of the coefficients as a function of increasing distance from the origin is simply an attempt to reduce blurring in the smoothing process. We could have chosen other weights to accomplish the same general objective. However, the sum of all the coefficients in the mask of Fig. 3.32(b) is equal to 16, an attractive feature for computer implementation because it is an integer power of 2. In practice, it is difficult in general to see differences between images smoothed by using either of the masks in Fig. 3.32, or similar arrangements, because the area spanned by these masks at any one location in an image is so small.
With reference to Eq. (3.4-1), the general implementation for filtering an M × N image with a weighted averaging filter of size m × n (m and n odd) is given by the expression

$$g(x, y) = \frac{\displaystyle\sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s,\, y + t)}{\displaystyle\sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)} \tag{3.5-1}$$

The parameters in this equation are as defined in Eq. (3.4-1). As before, it is understood that the complete filtered image is obtained by applying Eq. (3.5-1) for x = 0, 1, 2, ..., M - 1 and y = 0, 1, 2, ..., N - 1. The denominator in
Eq. (3.5-1) is simply the sum of the mask coefficients and, therefore, it is a con-
stant that needs to be computed only once.
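In code, Eq. (3.5-1) reduces to one correlation followed by a single division; a minimal sketch (Python/SciPy, not from the book) using the weighted mask of Fig. 3.32(b):

```python
import numpy as np
from scipy.ndimage import correlate

w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]], dtype=float)   # mask of Fig. 3.32(b)

def weighted_average(f):
    # Eq. (3.5-1): sum of products divided by the constant sum of the
    # coefficients (16 here), computed once rather than per pixel.
    return correlate(f.astype(float), w, mode='constant') / w.sum()
```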
EXAMPLE 3.13: Image smoothing with masks of various sizes.

■ The effects of smoothing as a function of filter size are illustrated in Fig. 3.33, which shows an original image and the corresponding smoothed results obtained using square averaging filters of sizes m = 3, 5, 9, 15, and 35 pixels, respectively. The principal features of these results are as follows: For m = 3, we note a general slight blurring throughout the entire image but, as expected, details that are of approximately the same size as the filter mask are affected considerably more. For example, the 3 × 3 and 5 × 5 black squares in the image, the small letter “a,” and the fine grain noise show significant blurring when compared to the rest of the image. Note that the noise is less pronounced, and the jagged borders of the characters were pleasingly smoothed.
The result for m = 5 is somewhat similar, with a slight further increase in
blurring. For m = 9 we see considerably more blurring, and the 20% black circle is not nearly as distinct from the background as in the previous three images, illustrating the blending effect that blurring has on objects whose intensities are close to those of their neighboring pixels. Note the significant further smoothing of the noisy rectangles. The results for m = 15 and 35 are extreme with respect to the sizes of the objects in the image. This type of aggressive blurring generally is used to eliminate small objects from an image.
For instance, the three small squares, two of the circles, and most of the noisy
rectangle areas have been blended into the background of the image in
Fig. 3.33(f). Note also in this figure the pronounced black border. This is a re-
sult of padding the border of the original image with 0s (black) and then
trimming off the padded area after filtering. Some of the black was blended
into all filtered images, but became truly objectionable for the images
smoothed with the larger filters. ■
FIGURE 3.33 (a) Original image, of size 500 × 500 pixels. (b)–(f) Results of smoothing with square averaging filter masks of sizes m = 3, 5, 9, 15, and 35, respectively. The black squares at the top are of sizes 3, 5, 9, 15, 25, 35, 45, and 55 pixels, respectively; their borders are 25 pixels apart. The letters at the bottom range in size from 10 to 24 points, in increments of 2 points; the large letter at the top is 60 points. The vertical bars are 5 pixels wide and 100 pixels high; their separation is 20 pixels. The diameter of the circles is 25 pixels, and their borders are 15 pixels apart; their intensity levels range from 0% to 100% black in increments of 20%. The background of the image is 10% black. The noisy rectangles are of size 50 × 120 pixels.
FIGURE 3.34 (a) Image of size 528 × 485 pixels from the Hubble Space Telescope. (b) Image filtered with a 15 × 15 averaging mask. (c) Result of thresholding (b). (Original image courtesy of NASA.)

FIGURE 3.35 (a) X-ray image of circuit board corrupted by salt-and-pepper noise. (b) Noise reduction with a 3 × 3 averaging mask. (c) Noise reduction with a 3 × 3 median filter. (Original image courtesy of Mr. Joseph E. Pascente, Lixi, Inc.)
Although the median filter is by far the most useful order-statistic filter in image processing, it is by no means the only one. The median represents the 50th percentile of a ranked set of numbers, but recall from basic statistics that ranking lends itself to many other possibilities. (See Section 10.3.5 regarding percentiles.) For example, using the 100th percentile results in the so-called max filter, which is useful for finding the brightest points in an image. The response of a 3 × 3 max filter is given by R = max{z_k | k = 1, 2, ..., 9}. The 0th percentile filter is the min filter, used for the opposite purpose. Median, max, min, and several other nonlinear filters are considered in more detail in Section 5.3.
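Each of these order-statistic filters amounts to ranking the neighborhood values and picking one; a compact sketch (Python/SciPy; illustrative, with random data standing in for a real image):

```python
import numpy as np
from scipy.ndimage import median_filter, maximum_filter, minimum_filter

img = np.random.randint(0, 256, (64, 64)).astype(np.uint8)

med = median_filter(img, size=3)    # 50th percentile of each 3 x 3 neighborhood
mx  = maximum_filter(img, size=3)   # 100th percentile: the max filter
mn  = minimum_filter(img, size=3)   # 0th percentile: the min filter
```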
EXAMPLE 3.14: Use of median filtering for noise reduction.

■ Figure 3.35(a) shows an X-ray image of a circuit board heavily corrupted by salt-and-pepper noise. To illustrate the point about the superiority of median filtering over average filtering in situations such as this, we show in Fig. 3.35(b) the result of processing the noisy image with a 3 × 3 neighborhood averaging mask, and in Fig. 3.35(c) the result of using a 3 × 3 median filter. The averaging filter blurred the image and its noise reduction performance was poor. The superiority in all respects of median over average filtering in this case is quite evident. In general, median filtering is much better suited than averaging for the removal of salt-and-pepper noise. ■
Image sharpening can be accomplished by spatial differentiation, and the discussion in this section deals with various ways of defining and implementing operators for sharpening by digital differentiation. Fundamentally, the
strength of the response of a derivative operator is proportional to the degree
of intensity discontinuity of the image at the point at which the operator is ap-
plied. Thus, image differentiation enhances edges and other discontinuities
(such as noise) and deemphasizes areas with slowly varying intensities.
3.6.1 Foundation
In the two sections that follow, we consider in some detail sharpening filters
that are based on first- and second-order derivatives, respectively. Before pro-
ceeding with that discussion, however, we stop to look at some of the funda-
mental properties of these derivatives in a digital context. To simplify the
explanation, we focus attention initially on one-dimensional derivatives. In
particular, we are interested in the behavior of these derivatives in areas of
constant intensity, at the onset and end of discontinuities (step and ramp dis-
continuities), and along intensity ramps. As you will see in Chapter 10, these
types of discontinuities can be used to model noise points, lines, and edges in
an image. The behavior of derivatives during transitions into and out of these
image features also is of interest.
The derivatives of a digital function are defined in terms of differences.
There are various ways to define these differences. However, we require that
any definition we use for a first derivative (1) must be zero in areas of constant
intensity; (2) must be nonzero at the onset of an intensity step or ramp; and
(3) must be nonzero along ramps. Similarly, any definition of a second deriva-
tive (1) must be zero in constant areas; (2) must be nonzero at the onset and
end of an intensity step or ramp; and (3) must be zero along ramps of constant
slope. Because we are dealing with digital quantities whose values are finite,
the maximum possible intensity change also is finite, and the shortest distance
over which that change can occur is between adjacent pixels.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference

$$\frac{\partial f}{\partial x} = f(x + 1) - f(x) \tag{3.6-1}$$

(We return to Eq. (3.6-1) in Section 10.2.1 and show how it follows from a Taylor series expansion. For now, we accept it as a definition.) We used a partial derivative here in order to keep the notation the same as when we consider an image function of two variables, f(x, y), at which time we will be dealing with partial derivatives along the two spatial axes. Use of a partial derivative in the present discussion does not affect in any way the nature of what we are trying to accomplish. Clearly, ∂f/∂x = df/dx when there is only one variable in the function; the same is true for the second derivative.

We define the second-order derivative of f(x) as the difference

$$\frac{\partial^2 f}{\partial x^2} = f(x + 1) + f(x - 1) - 2f(x) \tag{3.6-2}$$
It is easily verified that these two definitions satisfy the conditions stated above. To illustrate this, and to examine the similarities and differences between first- and second-order derivatives in a digital context, consider the example in Fig. 3.36.
FIGURE 3.36 Illustration of the first and second derivatives of a 1-D digital function representing a section of a horizontal intensity profile from an image. In (a) and (c) data points are joined by dashed lines as a visualization aid. Panel (a) plots the intensity transition (constant intensity, ramp, and step), panel (b) tabulates the scan line and its derivatives, and panel (c) plots the two derivatives, showing the zero crossing of the second derivative at the step:

scan line:        6 6 6 6 5 4 3 2 1 1 1 1 1 1 6 6 6 6 6
1st derivative:     0 0 -1 -1 -1 -1 -1 0 0 0 0 0 5 0 0 0 0
2nd derivative:     0 0 -1 0 0 0 0 1 0 0 0 0 5 -5 0 0 0
As we traverse the profile from left to right, we first encounter an area of constant intensity where, as Fig. 3.36 shows, both derivatives are zero, so property (1) is satisfied. Next we encounter an intensity ramp followed by a step: the first derivative is nonzero at the onset of the ramp and the step; similarly, the second derivative is nonzero at the onset and end of both the ramp and the step; therefore, property (2) is satisfied for both derivatives. Finally, we see that property (3) is satisfied also for both derivatives because the first derivative is nonzero and the second is zero along the ramp. Note that the sign of the second derivative changes at the onset and end of a step or ramp. In fact, we see in Fig. 3.36(c) that in a step transition a line joining these two values crosses the horizontal axis midway between the two extremes. This zero crossing property is quite useful for locating edges, as you will see in Chapter 10.
Edges in digital images often are ramp-like transitions in intensity, in which case the first derivative of the image would result in thick edges because the derivative is nonzero along a ramp. On the other hand, the second derivative would produce a double edge one pixel thick, separated by zeros. From this, we conclude that the second derivative enhances fine detail much better than the first derivative, a property that is ideally suited for sharpening images. Also, as you will learn later in this section, second derivatives are much easier to implement than first derivatives, so we focus our attention initially on second derivatives.
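A quick numerical check of Eqs. (3.6-1) and (3.6-2) on the scan line of Fig. 3.36 (Python/NumPy; the array is transcribed from the figure) reproduces the tabulated behavior:

```python
import numpy as np

f = np.array([6, 6, 6, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1, 1, 6, 6, 6, 6, 6])

d1 = f[1:] - f[:-1]                  # Eq. (3.6-1): f(x+1) - f(x)
d2 = f[2:] + f[:-2] - 2 * f[1:-1]    # Eq. (3.6-2): f(x+1) + f(x-1) - 2f(x)

print(d1)   # nonzero all along the ramp; a single spike of 5 at the step
print(d2)   # nonzero only at ramp/step onsets and ends; zero along the ramp
```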
Therefore, it follows from the preceding three equations that the discrete Laplacian of two variables is

$$\nabla^2 f(x, y) = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4f(x, y) \tag{3.6-6}$$
This equation can be implemented using the filter mask in Fig. 3.37(a), which
gives an isotropic result for rotations in increments of 90°. The mechanics of
implementation are as in Section 3.5.1 for linear smoothing filters. We simply
are using different coefficients here.
The diagonal directions can be incorporated in the definition of the digital
Laplacian by adding two more terms to Eq. (3.6-6), one for each of the two di-
agonal directions. The form of each new term is the same as either Eq. (3.6-4) or
(3.6-5), but the coordinates are along the diagonals. Because each diagonal term
also contains a -2f(x, y) term, the total subtracted from the difference terms
now would be -8f(x, y). Figure 3.37(b) shows the filter mask used to imple-
ment this new definition. This mask yields isotropic results in increments of 45°.
You are likely to see in practice the Laplacian masks in Figs. 3.37(c) and (d).
They are obtained from definitions of the second derivatives that are the nega-
tives of the ones we used in Eqs. (3.6-4) and (3.6-5). As such, they yield equiva-
lent results, but the difference in sign must be kept in mind when combining (by
addition or subtraction) a Laplacian-filtered image with another image.
Because the Laplacian is a derivative operator, its use highlights intensity
discontinuities in an image and deemphasizes regions with slowly varying in-
tensity levels. This will tend to produce images that have grayish edge lines and
other discontinuities, all superimposed on a dark, featureless background.
Background features can be “recovered” while still preserving the sharpening
FIGURE 3.37 (a) Filter mask used to implement Eq. (3.6-6): [0 1 0; 1 −4 1; 0 1 0]. (b) Mask used to implement an extension of this equation that includes the diagonal terms: [1 1 1; 1 −8 1; 1 1 1]. (c) and (d) Two other implementations of the Laplacian found frequently in practice: [0 −1 0; −1 4 −1; 0 −1 0] and [−1 −1 −1; −1 8 −1; −1 −1 −1].
effect of the Laplacian simply by adding the Laplacian image to the original.
As noted in the previous paragraph, it is important to keep in mind which def-
inition of the Laplacian is used. If the definition used has a negative center co-
efficient, then we subtract, rather than add, the Laplacian image to obtain a
sharpened result. Thus, the basic way in which we use the Laplacian for image
sharpening is

$$g(x, y) = f(x, y) + c\left[\nabla^2 f(x, y)\right] \tag{3.6-7}$$

where f(x, y) and g(x, y) are the input and sharpened images, respectively. The constant is c = −1 if the Laplacian filters in Fig. 3.37(a) or (b) are used, and c = 1 if either of the other two filters is used.
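In code, Eq. (3.6-7) is one filtering pass and one addition. A minimal sketch (Python/SciPy; illustrative, and the clipping to a displayable range is an assumption about how results are shown) using the mask of Fig. 3.37(a), for which c = −1:

```python
import numpy as np
from scipy.ndimage import correlate

lap_mask = np.array([[0,  1, 0],
                     [1, -4, 1],
                     [0,  1, 0]], dtype=float)    # Fig. 3.37(a)

def laplacian_sharpen(f, c=-1.0):
    f = f.astype(float)
    lap = correlate(f, lap_mask, mode='nearest')  # Eq. (3.6-6)
    return np.clip(f + c * lap, 0, 255)           # Eq. (3.6-7)
```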
EXAMPLE 3.15: Image sharpening using the Laplacian.

■ Figure 3.38(a) shows a slightly blurred image of the North Pole of the moon. Figure 3.38(b) shows the result of filtering this image with the Laplacian mask in Fig. 3.37(a). Large sections of this image are black because the Laplacian contains both positive and negative values, and all negative values are clipped at 0 by the display.
A typical way to scale a Laplacian image is to add to it its minimum value to
bring the new minimum to zero and then scale the result to the full [0, L - 1]
intensity range, as explained in Eqs. (2.6-10) and (2.6-11). The image in
Fig. 3.38(c) was scaled in this manner. Note that the dominant features of the
image are edges and sharp intensity discontinuities. The background, previously
black, is now gray due to scaling. This grayish appearance is typical of Laplacian
images that have been scaled properly. Figure 3.38(d) shows the result obtained
using Eq. (3.6-7) with c = -1. The detail in this image is unmistakably clearer
and sharper than in the original image. Adding the original image to the Lapla-
cian restored the overall intensity variations in the image, with the Laplacian in-
creasing the contrast at the locations of intensity discontinuities. The net result is
an image in which small details were enhanced and the background tonality was
reasonably preserved. Finally, Fig. 3.38(e) shows the result of repeating the pre-
ceding procedure with the filter in Fig. 3.37(b). Here, we note a significant im-
provement in sharpness over Fig. 3.38(d). This is not unexpected because using
the filter in Fig. 3.37(b) provides additional differentiation (sharpening) in the
diagonal directions. Results such as those in Figs. 3.38(d) and (e) have made the
Laplacian a tool of choice for sharpening digital images. ■
FIGURE 3.38 (a) Blurred image of the North Pole of the moon. (b) Laplacian without scaling. (c) Laplacian with scaling. (d) Image sharpened using the mask in Fig. 3.37(a). (e) Result of using the mask in Fig. 3.37(b). (Original image courtesy of NASA.)
Letting f̄(x, y) denote the blurred image, unsharp masking is expressed in equation form as follows. First we obtain the mask:

$$g_{\text{mask}}(x, y) = f(x, y) - \bar{f}(x, y) \tag{3.6-8}$$

Then we add a weighted portion of the mask back to the original image:

$$g(x, y) = f(x, y) + k\, g_{\text{mask}}(x, y) \tag{3.6-9}$$

where we included a weight, k (k ≥ 0), for generality. When k = 1, we have unsharp masking as defined above. When k > 1, the process is referred to as highboost filtering.
FIGURE 3.39 1-D illustration of the mechanics of unsharp masking. (a) Original signal. (b) Blurred signal with original shown dashed for reference. (c) Unsharp mask. (d) Sharpened signal, obtained by adding (c) to (a).
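A minimal sketch of Eqs. (3.6-8) and (3.6-9) (Python/SciPy; the Gaussian blur stands in for whatever smoothing filter is used to produce f̄, and the clipping step is an assumption):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(f, sigma=3.0, k=1.0):
    f = f.astype(float)
    blurred = gaussian_filter(f, sigma)   # the blurred image f_bar(x, y)
    mask = f - blurred                    # Eq. (3.6-8)
    g = f + k * mask                      # Eq. (3.6-9); k > 1 gives highboost
    return np.clip(g, 0, 255)
```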
EXAMPLE 3.16: Image sharpening using unsharp masking.

■ Figure 3.40(a) shows a slightly blurred image of white text on a dark gray background. Figure 3.40(b) was obtained using a Gaussian smoothing filter (see Section 3.4.4) of size 5 × 5 with σ = 3. Figure 3.40(c) is the unsharp mask, obtained using Eq. (3.6-8). Figure 3.40(d) was obtained using unsharp
FIGURE 3.40 (a) Original image. (b) Result of blurring with a Gaussian filter. (c) Unsharp mask. (d) Result of using unsharp masking. (e) Result of using highboost filtering.
masking [Eq. (3.6-9) with k = 1]. This image is a slight improvement over the
original, but we can do better. Figure 3.40(e) shows the result of using Eq. (3.6-9)
with k = 4.5, the largest possible value we could use and still keep positive all the
values in the final result. The improvement in this image over the original is
significant. ■
$$M(x, y) = \sqrt{g_x^2 + g_y^2} \tag{3.6-11}$$

is the value at (x, y) of the rate of change in the direction of the gradient vector. Note that M(x, y) is an image of the same size as the original, created when x and y are allowed to vary over all pixel locations in f. It is common practice to refer to this image as the gradient image (or simply as the gradient when the meaning is clear).
Because the components of the gradient vector are derivatives, they are lin-
ear operators. However, the magnitude of this vector is not because of the
squaring and square root operations. On the other hand, the partial derivatives
in Eq. (3.6-10) are not rotation invariant (isotropic), but the magnitude of the
gradient vector is. In some implementations, it is more suitable computational-
ly to approximate the squares and square root operations by absolute values:
$$M(x, y) \approx |g_x| + |g_y| \tag{3.6-12}$$
This expression still preserves the relative changes in intensity, but the isotropic
property is lost in general. However, as in the case of the Laplacian, the isotrop-
ic properties of the discrete gradient defined in the following paragraph are pre-
served only for a limited number of rotational increments that depend on the
filter masks used to approximate the derivatives. As it turns out, the most popu-
lar masks used to approximate the gradient are isotropic at multiples of 90°.
These results are independent of whether we use Eq. (3.6-11) or (3.6-12), so
nothing of significance is lost in using the latter equation if we choose to do so.
As in the case of the Laplacian, we now define discrete approximations to
the preceding equations and from there formulate the appropriate filter
masks. In order to simplify the discussion that follows, we will use the notation
in Fig. 3.41(a) to denote the intensities of image points in a 3 × 3 region. For example, the center point, z5, denotes f(x, y) at an arbitrary location, (x, y); z1 denotes f(x − 1, y − 1); and so on, using the notation introduced in Fig. 3.28.

FIGURE 3.41 (a) A 3 × 3 region of an image (the zs are intensity values). (b)–(c) Roberts cross-gradient operators: [−1 0; 0 1] and [0 −1; 1 0]. (d)–(e) Sobel operators: [−1 −2 −1; 0 0 0; 1 2 1] and [−1 0 1; −2 0 2; −1 0 1]. All the mask coefficients sum to zero, as expected of a derivative operator.
As indicated in Section 3.6.1, the simplest approximations to a first-order derivative that satisfy the conditions stated in that section are gx = (z8 − z5) and gy = (z6 − z5). Two other definitions proposed by Roberts [1965] in the early development of digital image processing use the cross differences

$$g_x = z_9 - z_5 \quad \text{and} \quad g_y = z_8 - z_6$$

so that, using Eq. (3.6-12), the gradient image is approximated as

$$M(x, y) \approx |z_9 - z_5| + |z_8 - z_6| \tag{3.6-15}$$
where it is understood that x and y vary over the dimensions of the image in the manner described earlier. The cross-difference terms gx and gy just defined can be implemented using the two linear filter masks in Figs. 3.41(b) and (c). These masks are referred to as the Roberts cross-gradient operators.
Masks of even sizes are awkward to implement because they do not have a
center of symmetry. The smallest filter masks in which we are interested are of
size 3 * 3. Approximations to gx and gy using a 3 * 3 neighborhood centered
on z5 are as follows:
$$g_x = \frac{\partial f}{\partial x} = (z_7 + 2z_8 + z_9) - (z_1 + 2z_2 + z_3) \tag{3.6-16}$$

and

$$g_y = \frac{\partial f}{\partial y} = (z_3 + 2z_6 + z_9) - (z_1 + 2z_4 + z_7) \tag{3.6-17}$$
These equations can be implemented using the masks in Figs. 3.41(d) and (e).
The difference between the third and first rows of the 3 * 3 image region im-
plemented by the mask in Fig. 3.41(d) approximates the partial derivative in
the x-direction, and the difference between the third and first columns in the
other mask approximates the derivative in the y-direction. After computing the partial derivatives with these masks, we obtain the magnitude of the gradient as before. For example, substituting gx and gy into Eq. (3.6-12) yields

$$M(x, y) \approx |(z_7 + 2z_8 + z_9) - (z_1 + 2z_2 + z_3)| + |(z_3 + 2z_6 + z_9) - (z_1 + 2z_4 + z_7)|$$
The masks in Figs. 3.41(d) and (e) are called the Sobel operators. The idea be-
hind using a weight value of 2 in the center coefficient is to achieve some
smoothing by giving more importance to the center point (we discuss this in
more detail in Chapter 10). Note that the coefficients in all the masks shown in
Fig. 3.41 sum to 0, indicating that they would give a response of 0 in an area of
constant intensity, as is expected of a derivative operator.
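The Sobel gradient, with the absolute-value approximation of Eq. (3.6-12), can be sketched as follows (Python/SciPy; illustrative, and the border handling is an assumption):

```python
import numpy as np
from scipy.ndimage import correlate

sx = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], dtype=float)    # Fig. 3.41(d): d/dx
sy = sx.T                                     # Fig. 3.41(e): d/dy

def sobel_gradient(f):
    f = f.astype(float)
    gx = correlate(f, sx, mode='nearest')     # Eq. (3.6-16)
    gy = correlate(f, sy, mode='nearest')     # Eq. (3.6-17)
    return np.abs(gx) + np.abs(gy)            # Eq. (3.6-12)
```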
EXAMPLE 3.17: Use of the gradient for edge enhancement.

■ The gradient is used frequently in industrial inspection, either to aid humans in the detection of defects or, what is more common, as a preprocessing step in automated inspection. We will have more to say about this in Chapters 10 and 11. However, it will be instructive at this point to consider a simple example to show how the gradient can be used to enhance defects and eliminate slowly changing background features. In this example, enhancement is used as a preprocessing step for automated inspection, rather than for human analysis.
Figure 3.42(a) shows an optical image of a contact lens, illuminated by a
lighting arrangement designed to highlight imperfections, such as the two edge
defects in the lens boundary seen at 4 and 5 o’clock. Figure 3.42(b) shows the
gradient obtained using Eq. (3.6-12) with the two Sobel masks in Figs. 3.41(d)
and (e). The edge defects also are quite visible in this image, but with the
added advantage that constant or slowly varying shades of gray have been
eliminated, thus simplifying considerably the computational task required for
automated inspection. The gradient can be used also to highlight small specks that may not be readily visible in a gray-scale image (specks like these can be foreign matter, air pockets in a supporting solution, or minuscule imperfections in the lens). The ability to enhance small discontinuities in an otherwise flat
gray field is another important feature of the gradient. ■
FIGURE 3.42 (a) Optical image of contact lens (note defects on the boundary at 4 and 5 o'clock). (b) Sobel gradient. (Original image courtesy of Pete Sites, Perceptics Corporation.)
examples from image enhancement in this chapter not only saves having an
extra chapter in the book but, more importantly, is an effective tool for intro-
ducing newcomers to filtering techniques in the frequency domain. We use
frequency domain processing methods for other applications in Chapters 5, 8,
10, and 11.
where |C| and θ are as defined above. For example, the polar representation of the complex number 1 + j2 is √5 e^{jθ}, where θ = 63.4° or 1.1 radians. The preceding equations are applicable also to complex functions. For example, a complex function, F(u), of a variable u, can be expressed as the sum F(u) = R(u) + jI(u), where R(u) and I(u) are the real and imaginary component functions. As previously noted, the complex conjugate is F*(u) = R(u) − jI(u), the magnitude is |F(u)| = [R(u)² + I(u)²]^{1/2}, and the angle is θ(u) = arctan[I(u)/R(u)]. We return to complex functions several times in the course of this and the next chapter.
where

$$c_n = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-j\frac{2\pi n}{T} t}\, dt \qquad n = 0, \pm 1, \pm 2, \ldots \tag{4.2-7}$$

are the coefficients. The fact that Eq. (4.2-6) is an expansion of sines and cosines follows from Euler's formula, Eq. (4.2-4). We will return to the Fourier series later in this section.
of the sifting property involves an impulse located at an arbitrary point t0, denoted by δ(t − t0). In this case, the sifting property becomes

$$\int_{-\infty}^{\infty} f(t)\, \delta(t - t_0)\, dt = f(t_0) \tag{4.2-10}$$

which yields the value of the function at the impulse location, t0. For instance, if f(t) = cos(t), using the impulse δ(t − π) in Eq. (4.2-10) yields the result f(π) = cos(π) = −1. The power of the sifting concept will become quite evident shortly.
Let x represent a discrete variable. The unit discrete impulse, δ(x), serves the same purposes in the context of discrete systems as the impulse δ(t) does when working with continuous variables. It is defined as

$$\delta(x) = \begin{cases} 1 & x = 0 \\ 0 & x \neq 0 \end{cases} \tag{4.2-11a}$$

Clearly, this definition also satisfies the discrete equivalent of Eq. (4.2-8b):

$$\sum_{x=-\infty}^{\infty} \delta(x) = 1 \tag{4.2-11b}$$

As before, we see that the sifting property simply yields the value of the function at the location of the impulse. Figure 4.2 shows the unit discrete impulse diagrammatically. Unlike its continuous counterpart, the discrete impulse is an ordinary function.
Of particular interest later in this section is an impulse train, s_{ΔT}(t), defined as the sum of infinitely many periodic impulses ΔT units apart:

$$s_{\Delta T}(t) = \sum_{n=-\infty}^{\infty} \delta(t - n\Delta T) \tag{4.2-14}$$
FIGURE 4.2 A unit discrete impulse located at x = x0. Variable x is discrete, and δ is 0 everywhere except at x = x0.
FIGURE 4.3 An impulse train s_{ΔT}(t); the impulses are ΔT units apart and extend to infinity in both directions.
Figure 4.3 shows an impulse train. The impulses can be continuous or discrete.
† Conditions for the existence of the Fourier transform are complicated to state in general (Champeney [1987]), but a sufficient condition for its existence is that the integral of the absolute value of f(t), or the integral of the square of f(t), be finite. Existence is seldom an issue in practice, except for idealized signals, such as sinusoids that extend forever. These are handled using generalized impulse functions. Our primary interest is in the discrete Fourier transform pair which, as you will see shortly, is guaranteed to exist for all finite functions.
If f(t) is real, we see that its transform in general is complex. Note that the Fourier transform is an expansion of f(t) multiplied by sinusoidal terms whose frequencies are determined by the values of μ (variable t is integrated out, as mentioned earlier). Because the only variable left after integration is frequency, we say that the domain of the Fourier transform is the frequency domain. We discuss the frequency domain and its properties in more detail later in this chapter. In our discussion, t can represent any continuous variable, and the units of the frequency variable μ depend on the units of t. For example, if t represents time in seconds, the units of μ are cycles/sec or Hertz (Hz). If t represents distance in meters, then the units of μ are cycles/meter, and so on. In other words, the units of the frequency domain are cycles per unit of the independent variable of the input function. (For consistency in terminology used in the previous two chapters, and to be used later in this chapter in connection with images, we refer to the domain of variable t in general as the spatial domain.)
EXAMPLE 4.1: Obtaining the Fourier transform of a simple function.

■ The Fourier transform of the function in Fig. 4.4(a) follows from Eq. (4.2-16):

$$F(\mu) = \int_{-\infty}^{\infty} f(t)\, e^{-j2\pi\mu t}\, dt = \int_{-W/2}^{W/2} A\, e^{-j2\pi\mu t}\, dt = \frac{-A}{j2\pi\mu}\left[e^{-j2\pi\mu t}\right]_{-W/2}^{W/2} = \frac{-A}{j2\pi\mu}\left[e^{-j\pi\mu W} - e^{j\pi\mu W}\right] = \frac{A}{j2\pi\mu}\left[e^{j\pi\mu W} - e^{-j\pi\mu W}\right] = AW\,\frac{\sin(\pi\mu W)}{\pi\mu W}$$

where we used the trigonometric identity sin θ = (e^{jθ} − e^{−jθ})/2j to collapse the complex exponentials into a real sine function.
FIGURE 4.4 (a) A simple function; (b) its Fourier transform; and (c) the spectrum. All functions extend to infinity in both directions.
The result in the last step of the preceding expression is known as the sinc function:

$$\mathrm{sinc}(m) = \frac{\sin(\pi m)}{\pi m} \tag{4.2-19}$$

where sinc(0) = 1, and sinc(m) = 0 for all other integer values of m. Figure 4.4(b) shows a plot of F(μ).
In general, the Fourier transform contains complex terms, and it is customary for display purposes to work with the magnitude of the transform (a real quantity), which is called the Fourier spectrum or the frequency spectrum:

$$|F(\mu)| = AW\left|\frac{\sin(\pi\mu W)}{\pi\mu W}\right|$$
Figure 4.4(c) shows a plot of |F(μ)| as a function of frequency. The key properties to note are that the locations of the zeros of both F(μ) and |F(μ)| are inversely proportional to the width, W, of the “box” function, that the height of the lobes decreases as a function of distance from the origin, and that the function extends to infinity for both positive and negative values of μ. As you will see later, these properties are quite helpful in interpreting the spectra of two-dimensional Fourier transforms of images. ■
EXAMPLE 4.2: Fourier transform of an impulse and of an impulse train.

■ The Fourier transform of a unit impulse located at the origin follows from Eq. (4.2-16):

$$F(\mu) = \int_{-\infty}^{\infty} \delta(t)\, e^{-j2\pi\mu t}\, dt = \int_{-\infty}^{\infty} e^{-j2\pi\mu t}\, \delta(t)\, dt = e^{-j2\pi\mu 0} = e^{0} = 1$$
where the third step follows from the sifting property in Eq. (4.2-9). Thus, we
see that the Fourier transform of an impulse located at the origin of the spatial
domain is a constant in the frequency domain. Similarly, the Fourier transform
of an impulse located at t = t0 is

$$F(\mu) = \int_{-\infty}^{\infty} \delta(t - t_0)\, e^{-j2\pi\mu t}\, dt = \int_{-\infty}^{\infty} e^{-j2\pi\mu t}\, \delta(t - t_0)\, dt = e^{-j2\pi\mu t_0} = \cos(2\pi\mu t_0) - j\sin(2\pi\mu t_0)$$
where the third line follows from the sifting property in Eq. (4.2-10) and the last line follows from Euler's formula. These last two lines are equivalent representations of a unit circle centered on the origin of the complex plane.

In Section 4.3, we make use of the Fourier transform of a periodic impulse train. Obtaining this transform is not as straightforward as we just showed for individual impulses. However, understanding how to derive the transform of an impulse train is quite important, so we take the time to derive it in detail here. We start by noting that the only difference in the form of Eqs. (4.2-16) and (4.2-17) is the sign of the exponential. Thus, if a function f(t) has the Fourier transform F(μ), then the latter function evaluated at t, that is, F(t), must have the transform f(−μ). Using this symmetry property and given, as we showed above, that the Fourier transform of an impulse δ(t − t0) is e^{−j2πμt0}, it follows that the function e^{−j2πt0 t} has the transform δ(−μ − t0). By letting −t0 = a, it follows that the transform of e^{j2πat} is δ(−μ + a) = δ(μ − a), where the last step is true because δ is nonzero only when μ = a, which is the case for either δ(−μ + a) or δ(μ − a), so the two forms are equivalent.
The impulse train s_{ΔT}(t) in Eq. (4.2-14) is periodic with period ΔT, so we know from Section 4.2.2 that it can be expressed as a Fourier series:

$$s_{\Delta T}(t) = \sum_{n=-\infty}^{\infty} c_n\, e^{j\frac{2\pi n}{\Delta T} t}$$

where

$$c_n = \frac{1}{\Delta T} \int_{-\Delta T/2}^{\Delta T/2} s_{\Delta T}(t)\, e^{-j\frac{2\pi n}{\Delta T} t}\, dt$$

With reference to Fig. 4.3, we see that the integral in the interval [−ΔT/2, ΔT/2] encompasses only the impulse of s_{ΔT}(t) that is located at the origin. Therefore, the preceding equation becomes

$$c_n = \frac{1}{\Delta T} \int_{-\Delta T/2}^{\Delta T/2} \delta(t)\, e^{-j\frac{2\pi n}{\Delta T} t}\, dt = \frac{1}{\Delta T}\, e^{0} = \frac{1}{\Delta T}$$

The Fourier series then becomes

$$s_{\Delta T}(t) = \frac{1}{\Delta T} \sum_{n=-\infty}^{\infty} e^{j\frac{2\pi n}{\Delta T} t}$$
Our next task is to obtain the Fourier transform of this expression. Because summation is a linear process, obtaining the Fourier transform of a sum is the same as obtaining the sum of the transforms of the individual components. These components are exponentials, and we established earlier in this example that

$$\mathfrak{F}\left\{ e^{j\frac{2\pi n}{\Delta T} t} \right\} = \delta\!\left(\mu - \frac{n}{\Delta T}\right)$$

So, S(μ), the Fourier transform of the periodic impulse train s_{ΔT}(t), is

$$S(\mu) = \mathfrak{F}\left\{ s_{\Delta T}(t) \right\} = \mathfrak{F}\left\{ \frac{1}{\Delta T}\sum_{n=-\infty}^{\infty} e^{j\frac{2\pi n}{\Delta T} t} \right\} = \frac{1}{\Delta T}\sum_{n=-\infty}^{\infty} \mathfrak{F}\left\{ e^{j\frac{2\pi n}{\Delta T} t} \right\} = \frac{1}{\Delta T}\sum_{n=-\infty}^{\infty} \delta\!\left(\mu - \frac{n}{\Delta T}\right)$$
This fundamental result tells us that the Fourier transform of an impulse train with period ΔT is also an impulse train, whose period is 1/ΔT. This inverse proportionality between the periods of s_{ΔT}(t) and S(μ) is analogous to what we found in Fig. 4.4 in connection with a box function and its transform. This property plays a fundamental role in the remainder of this chapter. ■
4.2.5 Convolution
We need one more building block before proceeding. We introduced the idea
of convolution in Section 3.4.2. You learned in that section that convolution of
two functions involves flipping (rotating by 180°) one function about its origin
and sliding it past the other. At each displacement in the sliding process, we
perform a computation, which in the case of Chapter 3 was a sum of products.
In the present discussion, we are interested in the convolution of two continuous functions, f(t) and h(t), of one continuous variable, t, so we have to use integration instead of a summation. The convolution of these two functions, denoted as before by the operator ★, is defined as

$$f(t) \star h(t) = \int_{-\infty}^{\infty} f(\tau)\, h(t - \tau)\, d\tau \tag{4.2-20}$$

where the minus sign accounts for the flipping just mentioned, t is the displacement needed to slide one function past the other, and τ is a dummy variable that is integrated out. We assume for now that the functions extend from −∞ to ∞.
We illustrated the basic mechanics of convolution in Section 3.4.2, and we will do so again later in this chapter and in Chapter 5. At the moment, we are interested in finding the Fourier transform of Eq. (4.2-20). We start with Eq. (4.2-16):

$$\mathfrak{F}\{f(t) \star h(t)\} = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} f(\tau)\, h(t - \tau)\, d\tau \right] e^{-j2\pi\mu t}\, dt = \int_{-\infty}^{\infty} f(\tau) \left[ \int_{-\infty}^{\infty} h(t - \tau)\, e^{-j2\pi\mu t}\, dt \right] d\tau$$

The term inside the brackets is the Fourier transform of h(t − τ). We show later in this chapter that $\mathfrak{F}\{h(t - \tau)\} = H(\mu)\,e^{-j2\pi\mu\tau}$, where H(μ) is the Fourier transform of h(t). Using this fact in the preceding equation gives us

$$\mathfrak{F}\{f(t) \star h(t)\} = \int_{-\infty}^{\infty} f(\tau)\, H(\mu)\, e^{-j2\pi\mu\tau}\, d\tau = H(\mu) \int_{-\infty}^{\infty} f(\tau)\, e^{-j2\pi\mu\tau}\, d\tau = H(\mu)\, F(\mu)$$
Recalling from Section 4.2.4 that we refer to the domain of t as the spatial domain, and the domain of μ as the frequency domain, the preceding equation tells us that the Fourier transform of the convolution of two functions in the spatial domain is equal to the product in the frequency domain of the Fourier transforms of the two functions. Conversely, if we have the product of the two transforms, we can obtain the convolution in the spatial domain by computing the inverse Fourier transform. In other words, f(t) ★ h(t) and H(μ)F(μ) are a Fourier transform pair. This result is one-half of the convolution theorem and is written as

$$f(t) \star h(t) \Leftrightarrow H(\mu)\, F(\mu) \tag{4.2-21}$$

The double arrow is used to indicate that the expression on the right is obtained by taking the Fourier transform of the expression on the left, while the expression on the left is obtained by taking the inverse Fourier transform of the expression on the right.

Following a similar development would result in the other half of the convolution theorem:

$$f(t)\, h(t) \Leftrightarrow H(\mu) \star F(\mu) \tag{4.2-22}$$

which says that convolution in the frequency domain is analogous to multiplication in the spatial domain.
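The discrete, circular form of this result is easy to verify numerically (Python/NumPy; a sanity check, not part of the book's development):

```python
import numpy as np

f = np.random.rand(64)
h = np.random.rand(64)

# Circular convolution computed directly in the spatial domain.
conv = np.array([np.sum(f * np.roll(h[::-1], k + 1)) for k in range(64)])

# Same result via the convolution theorem: multiply the transforms.
conv_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)))

print(np.allclose(conv, conv_fft))    # True
```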
$$\delta(t, z) = \begin{cases} \infty & \text{if } t = z = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.5-1a}$$

and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(t, z)\, dt\, dz = 1 \tag{4.5-1b}$$
As in the 1-D case, the 2-D impulse exhibits the sifting property under integration,

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t, z)\, \delta(t, z)\, dt\, dz = f(0, 0) \tag{4.5-2}$$

or, more generally for an impulse located at coordinates (t0, z0),

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t, z)\, \delta(t - t_0,\, z - z_0)\, dt\, dz = f(t_0, z_0) \tag{4.5-3}$$
As before, we see that the sifting property yields the value of the function
f(t, z) at the location of the impulse.
For discrete variables x and y, the 2-D discrete impulse is defined as

$$\delta(x, y) = \begin{cases} 1 & \text{if } x = y = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.5-4}$$
As before, the sifting property of a discrete impulse yields the value of the dis-
crete function f(x, y) at the location of the impulse.
and

$$f(t, z) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(\mu, \nu)\, e^{j2\pi(\mu t + \nu z)}\, d\mu\, d\nu \tag{4.5-8}$$
where μ and ν are the frequency variables. When referring to images, t and z are interpreted to be continuous spatial variables. As in the 1-D case, the domain of the variables μ and ν defines the continuous frequency domain.
EXAMPLE 4.5: Obtaining the 2-D Fourier transform of a simple function.

■ Figure 4.13(a) shows a 2-D function analogous to the 1-D case in Example 4.1. Following a procedure similar to the one used in that example gives the result

$$F(\mu, \nu) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t, z)\, e^{-j2\pi(\mu t + \nu z)}\, dt\, dz = \int_{-T/2}^{T/2} \int_{-Z/2}^{Z/2} A\, e^{-j2\pi(\mu t + \nu z)}\, dt\, dz = ATZ \left[ \frac{\sin(\pi\mu T)}{\pi\mu T} \right] \left[ \frac{\sin(\pi\nu Z)}{\pi\nu Z} \right]$$

The magnitude (spectrum) is

$$|F(\mu, \nu)| = ATZ \left| \frac{\sin(\pi\mu T)}{\pi\mu T} \right| \left| \frac{\sin(\pi\nu Z)}{\pi\nu Z} \right|$$
Figure 4.13(b) shows a portion of the spectrum about the origin. As in the 1-D
case, the locations of the zeros in the spectrum are inversely proportional to
FIGURE 4.13 (a) A 2-D function, and (b) a section of its spectrum (not to scale). The block is longer along the t-axis, so the spectrum is more “contracted” along the μ-axis. Compare with Fig. 4.4.
the values of T and Z. Thus, the larger T and Z are, the more “contracted” the
spectrum will become, and vice versa. ■
$$s_{\Delta T \Delta Z}(t, z) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \delta(t - m\Delta T,\; z - n\Delta Z) \tag{4.5-9}$$
where ΔT and ΔZ are the separations between samples along the t- and z-axis of the continuous function f(t, z). Equation (4.5-9) describes a set of periodic impulses extending infinitely along the two axes (Fig. 4.14). As in the 1-D case illustrated in Fig. 4.5, multiplying f(t, z) by s_{ΔTΔZ}(t, z) yields the sampled function.
Function f(t, z) is said to be band-limited if its Fourier transform is 0 outside a rectangle established by the intervals [−μmax, μmax] and [−νmax, νmax]; that is,

$$F(\mu, \nu) = 0 \quad \text{for } |\mu| \ge \mu_{\max} \text{ and } |\nu| \ge \nu_{\max} \tag{4.5-10}$$

The two-dimensional sampling theorem states that a continuous, band-limited function f(t, z) can be recovered with no error from a set of its samples if the sampling intervals satisfy

$$\Delta T < \frac{1}{2\mu_{\max}} \tag{4.5-11}$$

and

$$\Delta Z < \frac{1}{2\nu_{\max}} \tag{4.5-12}$$
or, expressed in terms of the sampling rate, if
FIGURE 4.14 A 2-D impulse train, with impulses ΔT and ΔZ units apart along the t- and z-axes, respectively.
$$\frac{1}{\Delta T} > 2\mu_{\max} \tag{4.5-13}$$

and

$$\frac{1}{\Delta Z} > 2\nu_{\max} \tag{4.5-14}$$
Stated another way, we say that no information is lost if a 2-D, band-limited, continuous function is represented by samples acquired at rates greater than twice the highest frequency content of the function in both the μ- and ν-directions.
Figure 4.15 shows the 2-D equivalents of Figs. 4.6(b) and (d). A 2-D ideal box
filter has the form illustrated in Fig. 4.13(a). The dashed portion of Fig. 4.15(a)
shows the location of the filter to achieve the necessary isolation of a single pe-
riod of the transform for reconstruction of a band-limited function from its sam-
ples, as in Section 4.3.3. From Section 4.3.4, we know that if the function is
under-sampled the periods overlap, and it becomes impossible to isolate a single
period, as Fig. 4.15(b) shows. Aliasing would result under such conditions.
FIGURE 4.15 Two-dimensional Fourier transforms of (a) an over-sampled, and (b) an under-sampled band-limited function. The dashed rectangle in (a) shows the footprint of an ideal lowpass (box) filter needed to isolate one period of the transform.
EXAMPLE 4.6: Aliasing in images.

■ Suppose that we have an imaging system that is perfect, in the sense that it is noiseless and produces an exact digital image of what it sees, but the number of samples it can take is fixed at 96 × 96 pixels. If we use this system to digitize checkerboard patterns, it will be able to resolve patterns that are up to 96 × 96 squares, in which the size of each square is 1 × 1 pixels. In this limiting case, each pixel in the resulting image will correspond to one square in the pattern. We are interested in examining what happens when the detail (the size of the checkerboard squares) is less than one camera pixel; that is, when the imaging system is asked to digitize checkerboard patterns that have more than 96 × 96 squares in the field of view. (This example should not be construed as being unrealistic. Sampling a “perfect” scene under noiseless, distortion-free conditions is common when converting computer-generated models and vector drawings to digital images.)
Figures 4.16(a) and (b) show the result of sampling checkerboards whose
squares are of size 16 and 6 pixels on the side, respectively. These results are as
expected. However, when the size of the squares is reduced to slightly less than
one camera pixel a severely aliased image results, as Fig. 4.16(c) shows. Finally,
reducing the size of the squares to slightly less than 0.5 pixels on the side yielded
the image in Fig. 4.16(d). In this case, the aliased result looks like a normal
checkerboard pattern. In fact, this image would result from sampling a checker-
board image whose squares were 12 pixels on the side. This last image is a good
reminder that aliasing can create results that may be quite misleading. ■
FIGURE 4.16 Aliasing in images. In (a) and (b), the lengths of the sides of the squares are 16 and 6 pixels, respectively, and aliasing is visually negligible. In (c) and (d), the sides of the squares are 0.9174 and 0.4798 pixels, respectively, and the results show significant aliasing. Note that (d) masquerades as a “normal” image.
As illustrated in Examples 4.7 and 4.8, this term is related to blurring a digital image to reduce additional aliasing artifacts caused by resampling. The term does not apply to reducing aliasing in the original sampled image. A significant number of commercial digital cameras have true anti-aliasing filtering built in, either in the lens or on the surface of the sensor itself. For this reason, it is difficult to illustrate aliasing using images obtained with such cameras.
instance, to double the size of an image, we duplicate each column. This dou-
bles the image size in the horizontal direction. Then, we duplicate each row of
the enlarged image to double the size in the vertical direction. The same pro-
cedure is used to enlarge the image any integer number of times. The intensity-
level assignment of each pixel is predetermined by the fact that new locations
are exact duplicates of old locations.
Image shrinking is done in a manner similar to zooming. Under-sampling is achieved by row-column deletion (e.g., to shrink an image by one-half, we delete every other row and column). We can use the zooming grid analogy in Section 2.4.4 to visualize the concept of shrinking by a non-integer factor, except that we now expand the grid to fit over the original image, do intensity-level interpolation, and then shrink the grid back to its specified size. To reduce aliasing, it is a good idea to blur an image slightly before shrinking it (we discuss frequency domain blurring in Section 4.8), as sketched below. (The process of resampling an image without using band-limiting blurring is called decimation.) An alternative technique is to super-sample the original scene and then reduce (resample) its size by row and column deletion. This can yield sharper results than with smoothing, but it clearly requires access to the original scene. Clearly, if we have no access to the original scene (as typically is the case in practice) super-sampling is not an option.
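The sketch below (Python/SciPy; a simple spatial box blur stands in for a proper band-limiting filter, and the factor of 2 is arbitrary) contrasts plain row-column deletion with blurring before deletion:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def shrink_by_half(img, antialias=True):
    f = img.astype(float)
    if antialias:
        f = uniform_filter(f, size=3)   # slight blur to band-limit the image
    return f[::2, ::2]                  # row-column deletion (decimation)
```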
EXAMPLE 4.7: Illustration of aliasing in resampled images.

■ The effects of aliasing generally are worsened when the size of a digital image is reduced. Figure 4.17(a) is an image purposely created to illustrate the effects of aliasing (note the thinly-spaced parallel lines in all garments worn by the subject). There are no objectionable artifacts in Fig. 4.17(a), indicating that
FIGURE 4.17 Illustration of aliasing on resampled images. (a) A digital image with negligible visual aliasing. (b) Result of resizing the image to 50% of its original size by pixel deletion. Aliasing is clearly visible. (c) Result of blurring the image in (a) with a 3 × 3 averaging filter prior to resizing. The image is slightly more blurred than (b), but aliasing is no longer objectionable. (Original image courtesy of the Signal Compression Laboratory, University of California, Santa Barbara.)
the sampling rate used initially was sufficient to avoid visible aliasing. In
Fig. 4.17(b), the image was reduced to 50% of its original size using row-column deletion. The effects of aliasing are quite visible in this image (see, for example, the areas around the subject's knees). The digital “equivalent” of anti-aliasing filtering of continuous images is to attenuate the high fre-
of anti-aliasing filtering of continuous images is to attenuate the high fre-
quencies of a digital image by smoothing it before resampling. Figure
4.17(c) shows the result of smoothing the image in Fig. 4.17(a) with a 3 * 3
averaging filter (see Section 3.5) before reducing its size. The improvement
over Fig. 4.17(b) is evident. Images (b) and (c) were resized up to their orig-
inal dimension by pixel replication to simplify comparisons. ■
When you work with images that have strong edge content, the effects of
aliasing are seen as block-like image components, called jaggies. The following
example illustrates this phenomenon.
EXAMPLE 4.8: Illustration of jaggies in image shrinking.

■ Figure 4.18(a) shows a 1024 × 1024 digital image of a computer-generated scene in which aliasing is negligible. Figure 4.18(b) is the result of reducing the size of (a) by 75% to 256 × 256 pixels using bilinear interpolation and then using pixel replication to bring the image back to its original size in order to make the effects of aliasing (jaggies in this case) more visible. As in Example 4.7, the effects of aliasing can be made less objectionable by smoothing the image before resampling. Figure 4.18(c) is the result of using a 5 × 5 averaging filter prior to reducing the size of the image. As this figure shows, jaggies were reduced significantly. The size reduction and increase to the original size in Fig. 4.18(c) were done using the same approach used to generate Fig. 4.18(b). ■
FIGURE 4.18 Illustration of jaggies. (a) A 1024 × 1024 digital image of a computer-generated scene with negligible visible aliasing. (b) Result of reducing (a) to 25% of its original size using bilinear interpolation. (c) Result of blurring the image in (a) with a 5 × 5 averaging filter prior to resizing it to 25% using bilinear interpolation. (Original image courtesy of D. P. Mitchell, Mental Landscape, LLC.)
EXAMPLE 4.9: Illustration of jaggies in image zooming.

■ In the previous two examples, we used pixel replication to zoom the small resampled images. This is not a preferred approach in general, as Fig. 4.19 illustrates. Figure 4.19(a) shows a 1024 × 1024 zoomed image generated by pixel replication from a 256 × 256 section out of the center of the image in Fig. 4.18(a). Note the “blocky” edges. The zoomed image in Fig. 4.19(b) was generated from the same 256 × 256 section, but using bilinear interpolation. The edges in this result are considerably smoother. For example, the edges of the bottle neck and the large checkerboard squares are not nearly as blocky in (b) as they are in (a). ■
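The two zooming approaches compared in this example can be sketched as follows (Python/SciPy; zoom_replicate and zoom_bilinear are hypothetical helper names, not the book's):

```python
import numpy as np
from scipy.ndimage import zoom

def zoom_replicate(img, factor):
    # Pixel replication: each pixel becomes a factor x factor block.
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def zoom_bilinear(img, factor):
    # Bilinear interpolation (order=1) yields considerably smoother edges.
    return zoom(img.astype(float), factor, order=1)
```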
Moiré patterns
Before leaving this section, we examine another type of artifact, called moiré
patterns,† that sometimes result from sampling scenes with periodic or nearly
periodic components. In optics, moiré patterns refer to beat patterns pro-
duced between two gratings of approximately equal spacing. These patterns
are a common everyday occurrence. We see them, for example, in overlapping
insect window screens and on the interference between TV raster lines and
striped materials. In digital image processing, the problem arises routinely
when scanning media print, such as newspapers and magazines, or in images
with periodic components whose spacing is comparable to the spacing be-
tween samples. It is important to note that moiré patterns are more general
than sampling artifacts. For instance, Fig. 4.20 shows the moiré effect using ink
drawings that have not been digitized. Separately, the patterns are clean and
void of interference. However, superimposing one pattern on the other creates
FIGURE 4.19 Image zooming. (a) A 1024 × 1024 digital image generated by pixel replication from a 256 × 256 image extracted from the middle of Fig. 4.18(a). (b) Image generated using bilinear interpolation, showing a significant reduction in jaggies.
† The term moiré is a French word (not the name of a person) that appears to have originated with weavers who first noticed interference patterns visible on some fabrics; the term is rooted in the word mohair, a cloth made from Angora goat hair.
FIGURE 4.20 Examples of the moiré effect. These are ink drawings, not digitized patterns. Superimposing one pattern on the other is equivalent mathematically to multiplying the patterns.
a beat pattern that has frequencies not present in either of the original pat-
terns. Note in particular the moiré effect produced by two patterns of dots, as
this is the effect of interest in the following discussion.
Newspapers and other printed materials make use of so-called halftone dots, which are black dots or ellipses whose sizes and various joining schemes are used to simulate gray tones. (Color printing uses red, green, and blue dots to produce the sensation in the eye of continuous color.) As a rule, the following numbers are typical: newspapers are printed using 75 halftone dots per inch (dpi for short), magazines use 133 dpi, and high-quality brochures use 175 dpi. Figure 4.21 shows
FIGURE 4.21 A newspaper image of size 246 × 168 pixels sampled at 75 dpi showing a moiré pattern. The moiré pattern in this image is the interference pattern created between the ±45° orientation of the halftone dots and the north–south orientation of the sampling grid used to digitize the image.
what happens when a newspaper image is sampled at 75 dpi. The sampling lat-
tice (which is oriented vertically and horizontally) and dot patterns on the
newspaper image (oriented at ;45°) interact to create a uniform moiré pat-
tern that makes the image look blotchy. (We discuss a technique in Section
4.10.2 for reducing moiré interference patterns.)
As a related point of interest, Fig. 4.22 shows a newspaper image sam-
pled at 400 dpi to avoid moiré effects. The enlargement of the region sur-
rounding the subject’s left eye illustrates how halftone dots are used to
create shades of gray. The dot size is inversely proportional to image inten-
sity. In light areas, the dots are small or totally absent (see, for example, the
white part of the eye). In light gray areas, the dots are larger, as shown
below the eye. In darker areas, when dot size exceeds a specified value (typ-
ically 50%), dots are allowed to join along two specified directions to form
an interconnected mesh (see, for example, the left part of the eye). In some
cases the dots join along only one direction, as in the top right area below
the eyebrow.
FIGURE 4.22 A newspaper image and an enlargement showing how halftone dots are arranged to render shades of gray.
†
As mentioned in Section 4.4.1, keep in mind that in this chapter we use (t, z) and (m, n) to denote 2-D
continuous spatial and frequency-domain variables. In the 2-D discrete case, we use (x, y) for spatial
variables and (u, v) for frequency-domain variables.
Given the transform F(u, v), we can obtain f(x, y) by using the inverse discrete Fourier transform (IDFT):

f(x, y) = (1/MN) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} F(u, v) e^{j2π(ux/M + vy/N)}   (4.5-16)

for x = 0, 1, 2, …, M − 1 and y = 0, 1, 2, …, N − 1.

4.6 Some Properties of the 2-D Discrete Fourier Transform

4.6.1 Relationships Between Spatial and Frequency Intervals

Suppose that a continuous function f(t, z) is sampled to form a digital image, f(x, y), consisting of M * N samples taken ΔT and ΔZ units apart in the t- and z-directions, respectively. Then the separations between the corresponding discrete frequency-domain variables are given by

Δu = 1/(MΔT)   (4.6-1)

and

Δv = 1/(NΔZ)   (4.6-2)

respectively. Note that the separations between samples in the frequency domain are inversely proportional both to the spacing between spatial samples and the number of samples.
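As a quick numerical check of these relationships, the following NumPy sketch (our illustration, not part of the original discussion; the values of M and ΔT are arbitrary) confirms that the frequency-sample spacing equals 1/(MΔT):

```python
import numpy as np

M, dT = 256, 0.5               # number of samples and spatial spacing (assumed values)
u = np.fft.fftfreq(M, d=dT)    # frequency samples associated with an M-point DFT

du = u[1] - u[0]               # spacing between adjacent frequency samples
assert np.isclose(du, 1.0 / (M * dT))  # Eq. (4.6-1): du = 1/(M*dT)
```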
4.6.2 Translation and Rotation

It can be shown by direct substitution into the definitions of the DFT and its inverse that the following translation pairs hold:

f(x, y) e^{j2π(u0x/M + v0y/N)} ⇔ F(u − u0, v − v0)   (4.6-3)

and

f(x − x0, y − y0) ⇔ F(u, v) e^{−j2π(ux0/M + vy0/N)}   (4.6-4)

That is, multiplying f(x, y) by the exponential shown shifts the origin of the DFT to (u0, v0) and, conversely, multiplying F(u, v) by the negative of that exponential shifts the origin of f(x, y) to (x0, y0). As we illustrate in Example 4.13, translation has no effect on the magnitude (spectrum) of F(u, v).
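This property is easy to verify numerically. The sketch below (ours) translates an image circularly, which matches the modular indexing of the DFT, and confirms that the spectrum is unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((64, 64))            # arbitrary test image

F = np.fft.fft2(f)
F_shifted = np.fft.fft2(np.roll(f, shift=(10, 7), axis=(0, 1)))  # translated image

# Translation alters only the phase; the magnitude (spectrum) is unchanged.
assert np.allclose(np.abs(F), np.abs(F_shifted))
```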
Using the polar coordinates

x = r cos θ,  y = r sin θ,  u = ω cos φ,  v = ω sin φ

results in the following transform pair:

f(r, θ + θ0) ⇔ F(ω, φ + θ0)   (4.6-5)

which indicates that rotating f(x, y) by an angle θ0 rotates F(u, v) by the same angle. Conversely, rotating F(u, v) rotates f(x, y) by the same angle.
4.6.3 Periodicity
As in the 1-D case, the 2-D Fourier transform and its inverse are infinitely pe-
riodic in the u and v directions; that is,

F(u, v) = F(u + k1M, v) = F(u, v + k2N) = F(u + k1M, v + k2N)   (4.6-6)

and

f(x, y) = f(x + k1M, y) = f(x, y + k2N) = f(x + k1M, y + k2N)   (4.6-7)

where k1 and k2 are integers.
Consider now the 1-D version of the translation property in Eq. (4.6-3):

f(x) e^{j2π(u0x/M)} ⇔ F(u − u0)

In other words, multiplying f(x) by the exponential term shown shifts the data so that the origin, F(0), is located at u0. If we let u0 = M/2, the exponential term becomes e^{jπx}, which is equal to (−1)^x because x is an integer. In this case,

f(x)(−1)^x ⇔ F(u − M/2)

That is, multiplying f(x) by (−1)^x shifts the data so that F(0) is at the center of the interval [0, M − 1], which corresponds to Fig. 4.23(b), as desired.
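In NumPy terms (our sketch, not part of the original text), this centering trick is equivalent to np.fft.fftshift for sequences of even length:

```python
import numpy as np

M = 8
f = np.arange(M, dtype=float)              # arbitrary 1-D sequence (even length)
x = np.arange(M)

F_centered = np.fft.fft(f * (-1.0) ** x)   # multiply by (-1)^x, then take the DFT
F_shifted = np.fft.fftshift(np.fft.fft(f)) # take the DFT, then rotate F(0) to M/2

assert np.allclose(F_centered, F_shifted)
```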
In 2-D the situation is more difficult to graph, but the principle is the same, as Fig. 4.23(c) shows. Instead of two half periods, there are now four quarter periods meeting at the point (M/2, N/2). The dashed rectangles correspond to
a
b
c d
FIGURE 4.23 Centering the Fourier transform. (a) A 1-D DFT showing an infinite number of periods; two back-to-back periods meet at M/2. (b) Shifted DFT obtained by multiplying f(x) by (−1)^x before computing F(u); one period (M samples) is now complete and centered. (c) A 2-D DFT showing an infinite number of periods. The solid area is the M * N data array, F(u, v), obtained with Eq. (4.5-15). This array consists of four quarter periods; four back-to-back periods meet at (M/2, N/2). (d) A shifted DFT obtained by multiplying f(x, y) by (−1)^{x+y} before computing F(u, v). The data now contains one complete, centered period, as in (b).
the infinite number of periods of the 2-D DFT. As in the 1-D case, visualization is simplified if we shift the data so that F(0, 0) is at (M/2, N/2). Letting (u0, v0) = (M/2, N/2) in Eq. (4.6-3) results in the expression

f(x, y)(−1)^{x+y} ⇔ F(u − M/2, v − N/2)   (4.6-8)

Using this equation shifts the data so that F(0, 0) is at the center of the frequency rectangle defined by the intervals [0, M − 1] and [0, N − 1], as desired. Figure 4.23(d) shows the result.
this section as part of Example 4.11 and Fig. 4.24.
4.6.4 Symmetry Properties

Any real or complex function, w(x, y), can be expressed as the sum of an even and an odd part (each of which can be real or complex):

w(x, y) = w_e(x, y) + w_o(x, y)   (4.6-9)

where the even and odd parts are defined as

w_e(x, y) ≜ [w(x, y) + w(−x, −y)]/2   (4.6-10a)

and

w_o(x, y) ≜ [w(x, y) − w(−x, −y)]/2   (4.6-10b)

Substituting Eqs. (4.6-10a) and (4.6-10b) into Eq. (4.6-9) gives the identity w(x, y) ≡ w(x, y), thus proving the validity of the latter equation. It follows from the preceding definitions that

w_e(x, y) = w_e(−x, −y)   (4.6-11a)

and that

w_o(x, y) = −w_o(−x, −y)   (4.6-11b)

Even functions are said to be symmetric and odd functions are antisymmetric. Because all indices in the DFT and IDFT are positive, when we talk about symmetry (antisymmetry) we are referring to symmetry (antisymmetry) about the center point of a sequence. In terms of Eq. (4.6-11), indices to the right of the center point of a 1-D array are considered positive, and those to the left are considered negative (similarly in 2-D). In our work, it is more convenient to think only in terms of nonnegative indices, in which case the definitions of evenness and oddness become:

w_e(x, y) = w_e(M − x, N − y)   (4.6-12a)

and

w_o(x, y) = −w_o(M − x, N − y)   (4.6-12b)

where, as usual, M and N are the number of rows and columns of a 2-D array.
It can be shown that

Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} w_e(x, y) w_o(x, y) = 0   (4.6-13)

for any two discrete even and odd functions w_e and w_o. In other words, because the argument of Eq. (4.6-13) is odd, the result of the summations is 0. The functions can be real or complex.
EXAMPLE 4.10: Even and odd functions.

■ Although evenness and oddness are visualized easily for continuous functions, these concepts are not as intuitive when dealing with discrete sequences. The following illustrations will help clarify the preceding ideas. Consider the 1-D sequence

f = {f(0) f(1) f(2) f(3)} = {2 1 1 1}

in which M = 4. To test for evenness, the condition f(x) = f(4 − x) must be satisfied; that is, we require that f(0) = f(4), f(1) = f(3), f(2) = f(2), and f(3) = f(1).
Because f(4) is outside the range being examined, and it can be any value,
the value of f(0) is immaterial in the test for evenness. We see that the next
three conditions are satisfied by the values in the array, so the sequence is
even. In fact, we conclude that any 4-point even sequence has to have the
form
{a b c b}
That is, only the second and last points must be equal in a 4-point even se-
quence.
An odd sequence has the interesting property that its first term, w_o(0, 0), is always 0, a fact that follows directly from Eq. (4.6-10b). Consider the 1-D sequence

g = {g(0) g(1) g(2) g(3)} = {0 −1 0 1}
We easily can confirm that this is an odd sequence by noting that the terms in
the sequence satisfy the condition g(x) = -g(4 - x). For example,
g(1) = -g(3). Any 4-point odd sequence has the form
{0 −b 0 b}
That is, when M is an even number, a 1-D odd sequence has the property that
the points at locations 0 and M/2 always are zero. When M is odd, the first
term still has to be 0, but the remaining terms form pairs with equal value but
opposite sign.
The preceding discussion indicates that evenness and oddness of sequences
depend also on the length of the sequences. For example, we already showed
that the sequence {0 −1 0 1} is odd. However, the sequence
{0 −1 0 1 0} is neither odd nor even, although the “basic” structure ap-
pears to be odd. This is an important issue in interpreting DFT results. We
show later in this section that the DFTs of even and odd functions have some
very important characteristics. Thus, it often is the case that understanding
when a function is odd or even plays a key role in our ability to interpret image
results based on DFTs.
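Because interpreting DFT results often hinges on these definitions, a small test helper can be useful. The following sketch (ours, not from the original text) applies the 1-D versions of Eqs. (4.6-12a) and (4.6-12b):

```python
import numpy as np

def is_even(w):
    """Test w(x) == w(M - x) for x = 1, ..., M-1 (Eq. 4.6-12a in 1-D);
    w(0) is immaterial, as noted in the text."""
    M = len(w)
    x = np.arange(1, M)
    return np.allclose(w[x], w[(M - x) % M])

def is_odd(w):
    """Test w(x) == -w(M - x) for all x (Eq. 4.6-12b in 1-D); this forces w(0) = 0."""
    M = len(w)
    x = np.arange(M)
    return np.allclose(w[x], -w[(M - x) % M])

print(is_even(np.array([2, 1, 1, 1])))      # True:  the {a b c b} form
print(is_odd(np.array([0, -1, 0, 1])))      # True
print(is_odd(np.array([0, -1, 0, 1, 0])))   # False: same values, but length 5
```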
The same basic considerations hold in 2-D. For example, the 6 * 6 2-D se-
quence
0  0  0  0  0  0
0  0  0  0  0  0
0  0 −1  0  1  0
0  0 −2  0  2  0
0  0 −1  0  1  0
0  0  0  0  0  0

is odd. (As an exercise, you should use Eq. (4.6-12b) to convince yourself that this 2-D sequence is odd.) However, adding another row and column of 0s would give a result
that is neither odd nor even. Note that the inner structure of this array is a
Sobel mask, as discussed in Section 3.6.4. We return to this mask in
Example 4.15. ■
The 2-D Fourier transform of a real function, f(x, y), is conjugate symmetric:

F*(u, v) = F(−u, −v)   (4.6-14)

If f(x, y) is imaginary, its Fourier transform is conjugate antisymmetric: F*(−u, −v) = −F(u, v). (Conjugate symmetry also is called hermitian symmetry; the term antihermitian is used sometimes to refer to conjugate antisymmetry.) The proof of Eq. (4.6-14) is as follows:

F*(u, v) = [ Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−j2π(ux/M + vy/N)} ]*
= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f*(x, y) e^{j2π(ux/M + vy/N)}

= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−j2π([−u]x/M + [−v]y/N)}

= F(−u, −v)
where the third step follows from the fact that f(x, y) is real. A similar ap-
proach can be used to prove the conjugate antisymmetry exhibited by the
transform of imaginary functions.
Table 4.1 lists symmetries and related properties of the DFT that are useful
in digital image processing. Recall that the double arrows indicate Fourier
transform pairs; that is, for any row in the table, the properties on the right are
satisfied by the Fourier transform of the function having the properties listed
on the left, and vice versa. For example, entry 5 reads: The DFT of a real function f(x, y), in which (x, y) is replaced by (−x, −y), is F*(u, v), where F(u, v), the DFT of f(x, y), is a complex function, and vice versa.
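The conjugate symmetry of the DFT of a real function can also be checked numerically. In the sketch below (ours), negative frequency indices are taken modulo M and N, consistent with the periodicity property:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((6, 8))                 # real 2-D function

F = np.fft.fft2(f)

# F(-u, -v), with negative indices taken modulo M and N (DFT periodicity):
F_neg = np.roll(F[::-1, ::-1], shift=(1, 1), axis=(0, 1))

assert np.allclose(np.conj(F), F_neg)  # Eq. (4.6-14): F*(u, v) = F(-u, -v)
```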
EXAMPLE 4.11: 1-D illustrations of properties from Table 4.1.

■ With reference to the even and odd concepts discussed earlier and illustrated in Example 4.10, the following 1-D sequences and their transforms are short examples of the properties listed in Table 4.1. The numbers in parentheses on the right are the individual elements of F(u), and similarly for f(x) in the last two properties.
EXAMPLE 4.12: Proving several symmetry properties of the DFT from Table 4.1.

■ In this example, we prove several of the properties in Table 4.1 to develop familiarity with manipulating these important properties, and to establish a basis for solving some of the problems at the end of the chapter. We prove only the properties on the right given the properties on the left. The converse is proved in a manner similar to the proofs we give here.
Consider property 3, which reads: If f(x, y) is a real function, the real part of
its DFT is even and the odd part is odd; similarly, if a DFT has real and
imaginary parts that are even and odd, respectively, then its IDFT is a real
function. We prove this property formally as follows. F(u, v) is complex in
general, so it can be expressed as the sum of a real and an imaginary part:
F(u, v) = R(u, v) + jI(u, v). Then, F*(u, v) = R(u, v) − jI(u, v). Also, F(−u, −v) = R(−u, −v) + jI(−u, −v). But, as proved earlier, if f(x, y) is real then F*(u, v) = F(−u, −v), which, based on the preceding two equations, means that R(u, v) = R(−u, −v) and I(u, v) = −I(−u, −v). In view of Eqs. (4.6-11a)
and (4.6-11b), this proves that R is an even function and I is an odd function.
Next, we prove property 8. If f(x, y) is real we know from property 3 that
the real part of F(u, v) is even, so to prove property 8 all we have to do is show
that if f(x, y) is real and even then the imaginary part of F(u, v) is 0 (i.e., F is
real). The steps are as follows:
F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−j2π(ux/M + vy/N)}

= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f_r(x, y)] e^{−j2π(ux/M + vy/N)}

= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f_r(x, y)] e^{−j2π(ux/M)} e^{−j2π(vy/N)}

= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [even][even − j·odd][even − j·odd]

= Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [even][even·even − 2j·even·odd − odd·odd]

= real

The fourth step follows from Euler's equation and the fact that the cos and sin are even and odd functions, respectively. We also know from property 8 that, in addition to being real, f is an even function. The only term in the penultimate line containing imaginary components is the second term, which is 0 according to Eq. (4.6-13). Thus, if f is real and even then F is real. As noted earlier, F is also even because f is real. This concludes the proof.
Finally, we prove the validity of property 6. From the definition of the DFT,
ℑ{f(−x, −y)} = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) e^{−j2π(u[M−m]/M + v[N−n]/N)}

(To convince yourself that the summations are correct, try a 1-D transform and expand a few terms by hand.) Because exp[−j2π(integer)] = 1, it follows that

ℑ{f(−x, −y)} = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) e^{j2π(um/M + vn/N)}

= F(−u, −v)
This concludes the proof. ■
4.6.5 Fourier Spectrum and Phase Angle

Because the 2-D DFT is complex in general, it can be expressed in polar form:

F(u, v) = |F(u, v)| e^{jφ(u,v)}   (4.6-15)

where the magnitude

|F(u, v)| = [R²(u, v) + I²(u, v)]^{1/2}   (4.6-16)

is called the Fourier (or frequency) spectrum, and

φ(u, v) = arctan[ I(u, v) / R(u, v) ]   (4.6-17)

is the phase angle. Recall from the discussion in Section 4.2.1 that the arctan must be computed using a four-quadrant arctangent, such as MATLAB's atan2(Imag, Real) function.

Finally, the power spectrum is defined as

P(u, v) = |F(u, v)|² = R²(u, v) + I²(u, v)   (4.6-18)
As before, R and I are the real and imaginary parts of F(u, v), and all computations are carried out for the discrete variables u = 0, 1, 2, …, M − 1 and v = 0, 1, 2, …, N − 1. Therefore, |F(u, v)|, φ(u, v), and P(u, v) are arrays of size M * N.
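In code, these three arrays follow directly from the real and imaginary parts of the DFT. The following NumPy sketch (ours, not from the original text) computes them and verifies the polar representation of Eq. (4.6-15):

```python
import numpy as np

F = np.fft.fft2(np.random.default_rng(2).random((32, 32)))

R, I = F.real, F.imag
spectrum = np.abs(F)            # |F(u, v)| = sqrt(R^2 + I^2), Eq. (4.6-16)
phase = np.arctan2(I, R)        # four-quadrant arctangent, Eq. (4.6-17)
power = spectrum ** 2           # P(u, v) = R^2 + I^2, Eq. (4.6-18)

# The polar representation of Eq. (4.6-15) recovers F exactly:
assert np.allclose(F, spectrum * np.exp(1j * phase))
```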
The Fourier transform of a real function is conjugate symmetric [Eq. (4.6-14)], which implies that the spectrum has even symmetry about the origin:

|F(u, v)| = |F(−u, −v)|   (4.6-19)

The phase angle exhibits the following odd symmetry about the origin:

φ(u, v) = −φ(−u, −v)   (4.6-20)
It follows from Eq. (4.5-15) that

F(0, 0) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y)
a b
c d
FIGURE 4.24 (a) Image. (b) Spectrum showing bright spots in the four corners. (c) Centered spectrum. (d) Result showing increased detail after a log transformation. The zero crossings of the spectrum are closer in the vertical direction because the rectangle in (a) is longer in that direction. The coordinate convention used throughout the book places the origin of the spatial and frequency domains at the top left.
which indicates that the zero-frequency term is proportional to the average value of f(x, y). That is,

F(0, 0) = MN · (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) = MN f̄(x, y)   (4.6-21)

where f̄ denotes the average value of f. Because the proportionality constant MN usually is large, |F(0, 0)| typically is the largest component of the spectrum. Because frequency components u and v are zero at the origin, F(0, 0) sometimes is called the dc component of the transform.

EXAMPLE 4.13: ■ Figure 4.24(a) shows a simple image, and Fig. 4.24(b) shows its spectrum. As expected, the area around the origin of the transform contains the highest values (and thus appears brighter in the image).
However, note that the four corners of the spectrum contain similarly high values. The reason is the periodicity property discussed in the previous section. To center the spectrum, we simply multiply the image in (a) by (−1)^{x+y} before computing the DFT, as indicated in Eq. (4.6-8). Figure 4.24(c) shows the result, which clearly is much easier to visualize (note the symmetry about the center point). Because the dc term dominates the values of the spectrum, the dynamic range of the other intensities in the displayed image is compressed. To bring out those details, we perform a log transformation, as described in Section 3.2.2. Figure 4.24(d) shows the display of (1 + log |F(u, v)|). The increased rendition of detail is evident. Most spectra shown in this and subsequent chapters are scaled in this manner.
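The centering-plus-log display used for the spectra in this book is simple to reproduce; here is a minimal sketch (ours, with log(1 + |F|) implemented via log1p):

```python
import numpy as np

def log_spectrum_display(f):
    """Centered spectrum scaled as 1 + log|F| (a sketch of the scaling used
    for display, not the book's own code)."""
    M, N = f.shape
    checker = (-1.0) ** np.add.outer(np.arange(M), np.arange(N))  # (-1)^(x+y)
    F = np.fft.fft2(f * checker)      # centered DFT, per Eq. (4.6-8)
    return np.log1p(np.abs(F))        # log(1 + |F|) compresses the dc term
```

Equivalently, np.fft.fftshift could be applied to the uncentered DFT instead of the checkerboard multiplication.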
It follows from Eqs. (4.6-4) and (4.6-5) that the spectrum is insensitive to image translation (the absolute value of the exponential term is 1), but that the spectrum of a rotated image rotates by the same angle as the image. Figure 4.25 illustrates these properties.
The spectrum in Fig. 4.25(b) is identical to the spectrum in Fig. 4.24(d). Clearly,
the images in Figs. 4.24(a) and 4.25(a) are different, so if their Fourier spectra
are the same then, based on Eq. (4.6-15), their phase angles must be different.
Figure 4.26 confirms this. Figures 4.26(a) and (b) are the phase angle arrays
(shown as images) of the DFTs of Figs. 4.24(a) and 4.25(a). Note the lack of
similarity between the phase images, in spite of the fact that the only difference between their corresponding images is simple translation. In general, visual
analysis of phase angle images yields little intuitive information. For instance,
due to its 45° orientation, one would expect intuitively that the phase angle in
a b
c d
FIGURE 4.25 (a) The rectangle in Fig. 4.24(a) translated, and (b) the corresponding spectrum. (c) Rotated rectangle, and (d) the corresponding spectrum. The spectrum corresponding to the translated rectangle is identical to the spectrum corresponding to the original image in Fig. 4.24(a).
a b c
FIGURE 4.26 Phase angle array corresponding (a) to the image of the centered rectangle in Fig. 4.24(a), (b) to the translated image in Fig. 4.25(a), and (c) to the rotated image in Fig. 4.25(c).
Fig. 4.26(a) should correspond to the rotated image in Fig. 4.25(c), rather than to
the image in Fig. 4.24(a). In fact, as Fig. 4.26(c) shows, the phase angle of the ro-
tated image has a strong orientation that is much less than 45°. ■
EXAMPLE 4.14: Further illustration of the properties of the Fourier spectrum and phase angle.

■ Figure 4.27(b) is the phase angle of the DFT of Fig. 4.27(a). There is no detail in this array that would lead us by visual analysis to associate it with features in its corresponding image (not even the symmetry of the phase angle is visible). However, the importance of the phase in determining shape characteristics is evident in Fig. 4.27(c), which was obtained by computing the inverse
DFT of Eq. (4.6-15) using only phase information (i.e., with |F(u, v)| = 1 in
the equation). Although the intensity information has been lost (remember,
that information is carried by the spectrum) the key shape features in this
image are unmistakably from Fig. 4.27(a).
Figure 4.27(d) was obtained using only the spectrum in Eq. (4.6-15) and com-
puting the inverse DFT. This means setting the exponential term to 1, which in
turn implies setting the phase angle to 0. The result is not unexpected. It contains
only intensity information, with the dc term being the most dominant. There is
no shape information in the image because the phase was set to zero.
a b c
d e f
FIGURE 4.27 (a) Woman. (b) Phase angle. (c) Woman reconstructed using only the phase angle. (d) Woman reconstructed using only the spectrum. (e) Reconstruction using the phase angle corresponding to the woman and the spectrum corresponding to the rectangle in Fig. 4.24(a). (f) Reconstruction using the phase of the rectangle and the spectrum of the woman.
Finally, Figs. 4.27(e) and (f) show yet again the dominance of the phase in de-
termining the feature content of an image. Figure 4.27(e) was obtained by com-
puting the IDFT of Eq. (4.6-15) using the spectrum of the rectangle in Fig. 4.24(a)
and the phase angle corresponding to the woman. The shape of the woman
clearly dominates this result. Conversely, the rectangle dominates Fig. 4.27(f),
which was computed using the spectrum of the woman and the phase angle of
the rectangle. ■
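The phase-only and spectrum-only reconstructions of this example reduce to a few lines of code. The following sketch (ours, not from the original text) sets |F(u, v)| = 1 or φ(u, v) = 0 in Eq. (4.6-15) before inverting:

```python
import numpy as np

def phase_only(f):
    """Reconstruct from phase alone: set |F(u, v)| = 1 in Eq. (4.6-15)."""
    F = np.fft.fft2(f)
    return np.real(np.fft.ifft2(np.exp(1j * np.angle(F))))

def magnitude_only(f):
    """Reconstruct from the spectrum alone: set the phase angle to zero."""
    F = np.fft.fft2(f)
    return np.real(np.fft.ifft2(np.abs(F)))
```

Swapping the spectrum of one image with the phase of another, as in Figs. 4.27(e) and (f), combines np.abs of one DFT with np.exp(1j * np.angle(...)) of the other.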
4.6.6 The 2-D Convolution Theorem

Extending the 1-D circular convolution to two variables results in the 2-D circular convolution

f(x, y) ★ h(x, y) = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) h(x − m, y − n)   (4.6-23)

for x = 0, 1, 2, …, M − 1 and y = 0, 1, 2, …, N − 1. The 2-D convolution theorem is given by the expressions

f(x, y) ★ h(x, y) ⇔ F(u, v) H(u, v)   (4.6-24)

and, conversely,

f(x, y) h(x, y) ⇔ F(u, v) ★ H(u, v)   (4.6-25)

where F and H are obtained using Eq. (4.5-15) and, as before, the double arrow is used to indicate that the left and right sides of the expressions constitute a Fourier transform pair. Our interest in the remainder of this chapter is in Eq. (4.6-24), which states that the inverse DFT of the product F(u, v)H(u, v) yields f(x, y) ★ h(x, y), the 2-D spatial convolution of f and h. Similarly, the
DFT of the spatial convolution yields the product of the transforms in the fre-
quency domain. Equation (4.6-24) is the foundation of linear filtering and, as
explained in Section 4.7, is the basis for all the filtering techniques discussed in
this chapter.
Because we are dealing here with discrete quantities, computation of the Fourier transforms is carried out with a DFT algorithm. (We discuss efficient ways to compute the DFT in Section 4.11.) If we elect to compute the spatial convolution using the IDFT of the product of the two transforms, then the periodicity issues discussed in Section 4.6.3 must be taken into account. We give a 1-D example of this and then extend the conclusions to two variables. The left column of Fig. 4.28 implements convolution of two functions, f and h, using the 1-D equivalent of Eq. (3.4-2) which, because the two functions are of the same size, is written as
f(x) ★ h(x) = Σ_{m=0}^{399} f(m) h(x − m)
This equation is identical to Eq. (4.4-10), but the requirement on the displace-
ment x is that it be sufficiently large to cause the flipped (rotated) version of h
to slide completely past f. In other words, the procedure consists of (1) mirror-
ing h about the origin (i.e., rotating it by 180°) [Fig. 4.28(c)], (2) translating the
mirrored function by an amount x [Fig. 4.28(d)], and (3) for each value x of
translation, computing the entire sum of products in the right side of the pre-
ceding equation. In terms of Fig. 4.28 this means multiplying the function in
Fig. 4.28(a) by the function in Fig. 4.28(d) for each value of x. The displacement
x ranges over all values required to completely slide h across f. Figure 4.28(e)
shows the convolution of these two functions. Note that convolution is a func-
tion of the displacement variable, x, and that the range of x required in this ex-
ample to completely slide h past f is from 0 to 799.
If we use the DFT and the convolution theorem to obtain the same result as
in the left column of Fig. 4.28, we must take into account the periodicity inher-
ent in the expression for the DFT. This is equivalent to convolving the two pe-
riodic functions in Figs. 4.28(f) and (g). The convolution procedure is the same
as we just discussed, but the two functions now are periodic. Proceeding with
these two functions as in the previous paragraph would yield the result in
Fig. 4.28(j) which obviously is incorrect. Because we are convolving two peri-
odic functions, the convolution itself is periodic. The closeness of the periods in
Fig. 4.28 is such that they interfere with each other to cause what is commonly
referred to as wraparound error. According to the convolution theorem, if we
had computed the DFT of the two 400-point functions, f and h, multiplied the
a f
b g
c h
d i
e j
FIGURE 4.28 Left column: convolution of two discrete functions obtained using the approach discussed in Section 3.4.2. The result in (e) is correct. Right column: convolution of the same functions, but taking into account the periodicity implied by the DFT. Note in (j) how data from adjacent periods produce wraparound error, yielding an incorrect convolution result. To obtain the correct result, function padding must be used.
two transforms, and then computed the inverse DFT, we would have obtained
the erroneous 400-point segment of the convolution shown in Fig. 4.28(j).
Fortunately, the solution to the wraparound error problem is simple. Consider two functions, f(x) and h(x), composed of A and B samples, respectively. It can be shown (Brigham [1988]) that if we append zeros to both functions so that they have the same length, denoted by P, then wraparound is avoided by choosing

P ≥ A + B − 1   (4.6-26)

In our example, each function has 400 points, so the minimum value we could use is P = 799, which implies that we would append 399 zeros to the trailing edge of each function. This process is called zero padding. (The zeros could be appended also to the beginning of the functions, or they could be divided between the beginning and end of the functions. It is simpler to append them at the end.) As an exercise, you
should convince yourself that if the periods of the functions in Figs. 4.28(f) and
(g) were lengthened by appending to each period at least 399 zeros, the result
would be a periodic convolution in which each period is identical to the correct
result in Fig. 4.28(e). Using the DFT via the convolution theorem would result
in a 799-point spatial function identical to Fig. 4.28(e). The conclusion, then, is
that to obtain the same convolution result between the “straight” representa-
tion of the convolution equation approach in Chapter 3, and the DFT ap-
proach, functions in the latter must be padded prior to computing their
transforms.
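The 1-D experiment of Fig. 4.28 can be reproduced directly. In the sketch below (ours, with random 400-point functions), the unpadded DFT product yields the circular (wraparound) result, while padding to P = 799 recovers the correct linear convolution:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.random(400)
h = rng.random(400)

# Circular (unpadded) DFT convolution: 400 points, suffers wraparound error.
circular = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)))

# Zero-pad both functions to P >= A + B - 1 = 799 before transforming.
P = len(f) + len(h) - 1
padded = np.real(np.fft.ifft(np.fft.fft(f, P) * np.fft.fft(h, P)))

# The padded result matches ordinary linear convolution.
assert np.allclose(padded, np.convolve(f, h))
```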
Visualizing a similar example in 2-D would be more difficult, but we would
arrive at the same conclusion regarding wraparound error and the need for ap-
pending zeros to the functions. Let f(x, y) and h(x, y) be two image arrays of
sizes A * B and C * D pixels, respectively. Wraparound error in their circular
convolution can be avoided by padding these functions with zeros, as follows:
f_p(x, y) = { f(x, y)   0 ≤ x ≤ A − 1 and 0 ≤ y ≤ B − 1
           { 0          A ≤ x ≤ P or B ≤ y ≤ Q              (4.6-27)

and

h_p(x, y) = { h(x, y)   0 ≤ x ≤ C − 1 and 0 ≤ y ≤ D − 1
           { 0          C ≤ x ≤ P or D ≤ y ≤ Q              (4.6-28)

with

P ≥ A + C − 1   (4.6-29)

and

Q ≥ B + D − 1   (4.6-30)

The resulting padded images are of size P * Q. If both arrays are of the same size, M * N, then we require that

P ≥ 2M − 1   (4.6-31)

and

Q ≥ 2N − 1   (4.6-32)
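A minimal padding helper based on Eqs. (4.6-27) through (4.6-30) might look as follows (our sketch; zeros are appended at the trailing edges, as in the text):

```python
import numpy as np

def pad_pair(f, h):
    """Zero-pad f (A x B) and h (C x D) to the common size P x Q given by
    Eqs. (4.6-29) and (4.6-30), appending zeros at the trailing edges."""
    A, B = f.shape
    C, D = h.shape
    P, Q = A + C - 1, B + D - 1
    fp = np.pad(f, ((0, P - A), (0, Q - B)))   # pads with zeros by default
    hp = np.pad(h, ((0, P - C), (0, Q - D)))
    return fp, hp
```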
If a function is nonzero at the end of the interval, then a discontinuity would be created when zeros were
appended to the function to eliminate wraparound error. This is analogous to
multiplying a function by a box, which in the frequency domain would imply
convolution of the original transform with a sinc function (see Example 4.1).
This, in turn, would create so-called frequency leakage, caused by the high-
frequency components of the sinc function. Leakage produces a blocky effect
on images. Although leakage never can be totally eliminated, it can be reduced
significantly by multiplying the sampled function by another function that ta-
pers smoothly to near zero at both ends of the sampled record to dampen the
sharp transitions (and thus the high frequency components) of the box. This approach, called windowing or apodizing, is an important consideration when fidelity in image reconstruction (as in high-definition graphics) is desired. If you are faced with the need for windowing, a good approach is to use a 2-D Gaussian function (see Section 4.8.3). One advantage of this function is that its Fourier transform is Gaussian also, thus producing low leakage. (A simple apodizing function is a triangle, centered on the data record, which tapers to 0 at both ends of the record. This is called the Bartlett window. Other common windows are the Hamming and the Hann windows. We can even use a Gaussian function. We return to the issue of windowing in Section 5.11.5.)

4.6.7 Summary of 2-D Discrete Fourier Transform Properties

Table 4.2 summarizes the principal DFT definitions introduced in this chapter.
Separability is discussed in Section 4.11.1 and obtaining the inverse using a
forward transform algorithm is discussed in Section 4.11.2. Correlation is dis-
cussed in Chapter 12.
TABLE 4.2 Summary of DFT definitions and corresponding expressions.

1) Discrete Fourier transform (DFT) of f(x, y):
   F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−j2π(ux/M + vy/N)}

2) Inverse discrete Fourier transform (IDFT) of F(u, v):
   f(x, y) = (1/MN) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} F(u, v) e^{j2π(ux/M + vy/N)}

3) Polar representation:
   F(u, v) = |F(u, v)| e^{jφ(u,v)}

4) Spectrum:
   |F(u, v)| = [R²(u, v) + I²(u, v)]^{1/2},  where R = Real(F) and I = Imag(F)

5) Phase angle:
   φ(u, v) = tan⁻¹[ I(u, v) / R(u, v) ]

6) Power spectrum:
   P(u, v) = |F(u, v)|²

7) Average value:
   f̄(x, y) = (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) = (1/MN) F(0, 0)

8) Periodicity (k1 and k2 are integers):
   F(u, v) = F(u + k1M, v) = F(u, v + k2N) = F(u + k1M, v + k2N)
   f(x, y) = f(x + k1M, y) = f(x, y + k2N) = f(x + k1M, y + k2N)

9) Convolution:
   f(x, y) ★ h(x, y) = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) h(x − m, y − n)

10) Correlation:
    f(x, y) ☆ h(x, y) = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f*(m, n) h(x + m, y + n)
Table 4.3 summarizes some important DFT pairs. Although our focus is on
discrete functions, the last two entries in the table are Fourier transform pairs
that can be derived only for continuous variables (note the use of continuous
variable notation). We include them here because, with proper interpretation,
they are quite useful in digital image processing. The differentiation pair can be used, for example, to derive the frequency-domain equivalent of the Laplacian (Section 4.9).
TABLE 4.3 Summary of DFT pairs. The closed-form expressions in 12 and 13 are valid only for continuous variables. They can be used with discrete variables by sampling the closed-form, continuous expressions.

1) Symmetry properties: See Table 4.1

2) Linearity: a f1(x, y) + b f2(x, y) ⇔ a F1(u, v) + b F2(u, v)

3) Translation (general):
   f(x, y) e^{j2π(u0x/M + v0y/N)} ⇔ F(u − u0, v − v0)
   f(x − x0, y − y0) ⇔ F(u, v) e^{−j2π(ux0/M + vy0/N)}

4) Translation to center of the frequency rectangle, (M/2, N/2):
   f(x, y)(−1)^{x+y} ⇔ F(u − M/2, v − N/2)
   f(x − M/2, y − N/2) ⇔ F(u, v)(−1)^{u+v}

5) Rotation:
   f(r, θ + θ0) ⇔ F(ω, φ + θ0)
   x = r cos θ,  y = r sin θ,  u = ω cos φ,  v = ω sin φ

6) Convolution theorem†:
   f(x, y) ★ h(x, y) ⇔ F(u, v) H(u, v)
   f(x, y) h(x, y) ⇔ F(u, v) ★ H(u, v)

(Continued)
TABLE 4.3 (Continued)

7) Correlation theorem†:
   f(x, y) ☆ h(x, y) ⇔ F*(u, v) H(u, v)
   f*(x, y) h(x, y) ⇔ F(u, v) ☆ H(u, v)

8) Discrete unit impulse: δ(x, y) ⇔ 1

9) Rectangle: rect[a, b] ⇔ ab [sin(πua)/(πua)] [sin(πvb)/(πvb)] e^{−jπ(ua + vb)}

10) Sine: sin(2πu0x + 2πv0y) ⇔ j(1/2)[δ(u + Mu0, v + Nv0) − δ(u − Mu0, v − Nv0)]
a b
FIGURE 4.29 (a) SEM image of a damaged integrated circuit. (b) Fourier spectrum of (a). (Original image courtesy of Dr. J. M. Hudak, Brockhouse Institute for Materials Research, McMaster University, Hamilton, Ontario, Canada.)
that is off-axis slightly to the left. This component was caused by the edges of
the oxide protrusions. Note how the angle of the frequency component with
respect to the vertical axis corresponds to the inclination (with respect to the
horizontal axis) of the long white element, and note also the zeros in the ver-
tical frequency component, corresponding to the narrow vertical span of the
oxide protrusions.
These are typical of the types of associations that can be made in general
between the frequency and spatial domains. As we show later in this chapter,
even these types of gross associations, coupled with the relationships men-
tioned previously between frequency content and rate of change of intensity
levels in an image, can lead to some very useful results. In the next section,
we show the effects of modifying various frequency ranges in the transform
of Fig. 4.29(a).
† Many software implementations of the 2-D DFT (e.g., MATLAB) do not center the transform. This implies that filter functions must be arranged to correspond to the same data format as the uncentered transform (i.e., with the origin at the top left). The net result is that filters are more difficult to generate and display. We use centering in our discussions to aid in visualization, which is crucial in developing a clear understanding of filtering concepts. Either method can be used in practice, as long as consistency is maintained.
FIGURE 4.30 Result of filtering the image in Fig. 4.29(a) by setting to 0 the term F(M/2, N/2) in the Fourier transform.
a b c
d e f
FIGURE 4.31 Top row: frequency domain filters. Bottom row: corresponding filtered images obtained using Eq. (4.7-1). We used a = 0.85 in (c) to obtain (f) (the height of the filter itself is 1). Compare (f) with Fig. 4.29(a).
a b c
FIGURE 4.32 (a) A simple image. (b) Result of blurring with a Gaussian lowpass filter without padding. (c) Result of lowpass filtering with padding. Compare the light area of the vertical edges in (b) and (c).
a b
FIGURE 4.33 2-D image periodicity inherent in using the DFT. (a) Periodicity without image padding. (b) Periodicity after padding with 0s (black). The dashed areas in the center correspond to the image in Fig. 4.32(a). (The thin white lines in both images are superimposed for clarity; they are not part of the data.)
dashed image, it will encompass part of the image and also part of the bottom
of the periodic image right above it. When a dark and a light region reside
under the filter, the result is a mid-gray, blurred output. However, when the fil-
ter is passing through the top right side of the image, the filter will encompass
only light areas in the image and its right neighbor. The average of a constant
is the same constant, so filtering will have no effect in this area, giving the re-
sult in Fig. 4.32(b). Padding the image with 0s creates a uniform border around
the periodic sequence, as Fig. 4.33(b) shows. Convolving the blurring function
with the padded “mosaic” of Fig. 4.33(b) gives the correct result in Fig. 4.32(c).
You can see from this example that failure to pad an image can lead to erro-
neous results. If the purpose of filtering is only for rough visual analysis, the
padding step is skipped sometimes.
Thus far, the discussion has centered on padding the input image, but
Eq. (4.7-1) also involves a filter that can be specified either in the spatial or in
the frequency domain. However, padding is done in the spatial domain, which
raises an important question about the relationship between spatial padding
and filters specified directly in the frequency domain.
At first glance, one could conclude that the way to handle padding of a
frequency domain filter is to construct the filter to be of the same size as the
image, compute the IDFT of the filter to obtain the corresponding spatial fil-
ter, pad that filter in the spatial domain, and then compute its DFT to return
to the frequency domain. The 1-D example in Fig. 4.34 illustrates the pitfalls in
this approach. Figure 4.34(a) shows a 1-D ideal lowpass filter in the frequency
domain. The filter is real and has even symmetry, so we know from property 8
in Table 4.1 that its IDFT will be real and symmetric also. Figure 4.34(b)
shows the result of multiplying the elements of the frequency domain filter
a c
b d
FIGURE 4.34 (a) Original filter specified in the (centered) frequency domain. (b) Spatial representation obtained by computing the IDFT of (a). (c) Result of padding (b) to twice its length (note the discontinuities). (d) Corresponding filter in the frequency domain obtained by computing the DFT of (c). Note the ringing caused by the discontinuities in (c). (The curves appear continuous because the points were joined to simplify visual analysis.)
by (−1)^u and computing its IDFT to obtain the corresponding spatial filter.
The extremes of this spatial function are not zero so, as Fig. 4.34(c) shows,
zero-padding the function created two discontinuities (padding the two ends
of the function is the same as padding one end, as long as the total number of
zeros used is the same).
To get back to the frequency domain, we compute the DFT of the spatial, padded filter. Figure 4.34(d) shows the result. The discontinuities in the spatial filter created ringing in its frequency domain counterpart, as you would expect from the results in Example 4.1. Viewed another way, we know from that example that the Fourier transform of a box function is a sinc function with frequency components extending to infinity, and we would expect the same behavior from the inverse transform of a box. That is, the spatial representation of an ideal (box) frequency domain filter has components extending to infinity. (See the end of Section 4.3.3 regarding the definition of an ideal filter.) Therefore, any spatial truncation of the filter to implement zero-padding will introduce discontinuities, which will then in general result in ringing in the frequency domain (truncation can be avoided in this case if it is done at zero crossings, but we are interested in general procedures, and not all filters have zero crossings).
What the preceding results tell us is that, because we cannot work with an infi-
nite number of components, we cannot use an ideal frequency domain filter [as in
Fig. 4.34(a)] and simultaneously use zero padding to avoid wraparound error. A
decision on which limitation to accept is required. Our objective is to work with
specified filter shapes in the frequency domain (including ideal filters) without
having to be concerned with truncation issues. One approach is to zero-pad im-
ages and then create filters in the frequency domain to be of the same size as the
padded images (remember, images and filters must be of the same size when
using the DFT). Of course, this will result in wraparound error because no
padding is used for the filter, but in practice this error is mitigated significantly by
the separation provided by the padding of the image, and it is preferable to ring-
ing. Smooth filters (such as those in Fig. 4.31) present even less of a problem.
Specifically, then, the approach we will follow in this chapter in order to work
with filters of a specified shape directly in the frequency domain is to pad images
to size P * Q and construct filters of the same dimensions. As explained ear-
lier, P and Q are given by Eqs. (4.6-29) and (4.6-30).
We conclude this section by analyzing the phase angle of the filtered transform. Because the DFT is a complex array, we can express it in terms of its real and imaginary parts: F(u, v) = R(u, v) + jI(u, v), so that the filtered result becomes

g(x, y) = ℑ⁻¹[H(u, v)R(u, v) + jH(u, v)I(u, v)]

The phase angle is not altered by filtering in the manner just described because H(u, v) cancels out when the ratio of the imaginary and real parts is formed in Eq. (4.6-17). Filters that affect the real and imaginary parts equally,
and thus have no effect on the phase, are appropriately called zero-phase-shift
filters. These are the only types of filters considered in this chapter.
Even small changes in the phase angle can have dramatic (usually undesir-
able) effects on the filtered output. Figure 4.35 illustrates the effect of some-
thing as simple as a scalar change. Figure 4.35(a) shows an image resulting
from multiplying the angle array in Eq. (4.6-15) by 0.5, without changing
a b
FIGURE 4.35 (a) Image resulting from multiplying by 0.5 the phase angle in Eq. (4.6-15) and then computing the IDFT. (b) The result of multiplying the phase by 0.25. The spectrum was not changed in either of the two cases.
ƒ F(u, v) ƒ , and then computing the IDFT. The basic shapes remain unchanged,
but the intensity distribution is quite distorted. Figure 4.35(b) shows the result
of multiplying the phase by 0.25. The image is almost unrecognizable.
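The experiment of Fig. 4.35 is easy to repeat. The following sketch (ours, not from the original text) scales the phase angle by a factor k while leaving the spectrum untouched:

```python
import numpy as np

def scale_phase(f, k):
    """Multiply the phase angle in Eq. (4.6-15) by k, keep |F| unchanged,
    and return the IDFT (the experiment of Fig. 4.35, as a sketch)."""
    F = np.fft.fft2(f)
    return np.real(np.fft.ifft2(np.abs(F) * np.exp(1j * k * np.angle(F))))

# g_half = scale_phase(img, 0.5); g_quarter = scale_phase(img, 0.25)
```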
†
If H(u, v) is to be generated from a given spatial filter, h(x, y), then we form h_p(x, y) by padding the spatial filter to size P * Q, multiply the expanded array by (−1)^{x+y}, and compute the DFT of the result to obtain a centered H(u, v). Example 4.15 illustrates this procedure.
a b c
d e f
g h
FIGURE 4.36 (a) An M * N image, f. (b) Padded image, f_p, of size P * Q. (c) Result of multiplying f_p by (−1)^{x+y}. (d) Spectrum of F_p. (e) Centered Gaussian lowpass filter, H, of size P * Q. (f) Spectrum of the product HF_p. (g) g_p, the product of (−1)^{x+y} and the real part of the IDFT of HF_p. (h) Final result, g, obtained by cropping the first M rows and N columns of g_p.
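The steps of Fig. 4.36 translate directly into code. Here is a minimal sketch (ours; it assumes H is supplied as a centered, real filter of the padded size):

```python
import numpy as np

def filter_frequency_domain(f, H):
    """Frequency domain filtering following the steps of Fig. 4.36 (a sketch).
    f is an M x N image; H is a centered filter of the padded size P x Q."""
    M, N = f.shape
    P, Q = H.shape                                   # typically P = 2M, Q = 2N
    fp = np.pad(f.astype(float), ((0, P - M), (0, Q - N)))   # (b) pad with zeros
    checker = (-1.0) ** np.add.outer(np.arange(P), np.arange(Q))
    Fp = np.fft.fft2(fp * checker)                   # (c)-(d) center, then DFT
    gp = np.real(np.fft.ifft2(H * Fp)) * checker     # (f)-(g) filter, IDFT, de-center
    return gp[:M, :N]                                # (h) crop the M x N result
```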
Computing the IDFT of a frequency domain filter yields the corresponding filter in the spatial domain. Conversely, it follows from a similar analysis and the convolu-
tion theorem that, given a spatial filter, we obtain its frequency domain repre-
sentation by taking the forward Fourier transform of the spatial filter.
Therefore, the two filters form a Fourier transform pair:

h(x, y) ⇔ H(u, v)   (4.7-4)

where h(x, y) is a spatial filter. Because this filter can be obtained from the re-
sponse of a frequency domain filter to an impulse, h(x, y) sometimes is re-
ferred to as the impulse response of H(u, v). Also, because all quantities in a
discrete implementation of Eq. (4.7-4) are finite, such filters are called finite
impulse response (FIR) filters. These are the only types of linear spatial filters
considered in this book.
We introduced spatial convolution in Section 3.4.1 and discussed its imple-
mentation in connection with Eq. (3.4-2), which involved convolving func-
tions of different sizes. When we speak of spatial convolution in terms of the
convolution theorem and the DFT, it is implied that we are convolving peri-
odic functions, as explained in Fig. 4.28. For this reason, as explained earlier,
Eq. (4.6-23) is referred to as circular convolution. Furthermore, convolution
in the context of the DFT involves functions of the same size, whereas in
Eq. (3.4-2) the functions typically are of different sizes.
In practice, we prefer to implement convolution filtering using Eq. (3.4-2)
with small filter masks because of speed and ease of implementation in
hardware and/or firmware. However, filtering concepts are more intuitive in
the frequency domain. One way to take advantage of the properties of both
domains is to specify a filter in the frequency domain, compute its IDFT,
and then use the resulting, full-size spatial filter as a guide for constructing
smaller spatial filter masks (more formal approaches are mentioned in
Section 4.11.4). This is illustrated next. Later in this section, we illustrate
also the converse, in which a small spatial filter is given and we obtain its
full-size frequency domain representation. This approach is useful for ana-
lyzing the behavior of small spatial filters in the frequency domain. Keep in
mind during the following discussion that the Fourier transform and its in-
verse are linear processes (Problem 4.14), so the discussion is limited to lin-
ear filtering.
In the following discussion, we use Gaussian filters to illustrate how
frequency domain filters can be used as guides for specifying the coefficients
of some of the small masks discussed in Chapter 3. Filters based on Gaussian
functions are of particular interest because, as noted in Table 4.3, both the
forward and inverse Fourier transforms of a Gaussian function are real
Gaussian functions. We limit the discussion to 1-D to illustrate the underly-
ing principles. Two-dimensional Gaussian filters are discussed later in this
chapter.
Let H(u) denote the 1-D frequency domain Gaussian filter:

H(u) = A e^{−u²/2σ²}   (4.7-5)

where σ is the standard deviation of the Gaussian curve. The corresponding filter in the spatial domain is obtained by taking the inverse Fourier transform of H(u):

h(x) = √(2π) σ A e^{−2π²σ²x²}   (4.7-6)
These equations† are important for two reasons: (1) They are a Fourier transform pair, both components of which are Gaussian and real. This facilitates analysis because we do not have to be concerned with complex numbers. In addition, Gaussian curves are intuitive and easy to manipulate. (2) The functions behave reciprocally. When H(u) has a broad profile (large value of σ),
†
As mentioned in Table 4.3, closed forms for the forward and inverse Fourier transforms of Gaussians
are valid only for continuous functions. To use discrete formulations we simply sample the continuous
Gaussian transforms. Our use of discrete variables here implies that we are dealing with sampled
transforms.
h(x) has a narrow profile, and vice versa. In fact, as σ approaches infinity, H(u) tends toward a constant function and h(x) tends toward an impulse, which implies no filtering in the frequency and spatial domains, respectively.
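This reciprocal behavior can be observed numerically. The sketch below (ours; the σ values are illustrative) samples H(u), inverts it, and compares the widths of the resulting spatial filters:

```python
import numpy as np

M = 256
u = np.fft.fftfreq(M) * M          # DFT frequency indices, origin at u = 0

def spatial_width(sigma):
    """IDFT of a sampled Gaussian H(u) = exp(-u^2 / 2 sigma^2); returns the
    number of samples of |h(x)| above half its peak (a crude width measure)."""
    H = np.exp(-u**2 / (2 * sigma**2))
    h = np.real(np.fft.ifft(H))
    return np.count_nonzero(np.abs(h) > 0.5 * np.abs(h).max())

print(spatial_width(4.0) > spatial_width(16.0))   # True: broader H, narrower h
```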
Figures 4.37(a) and (b) show plots of a Gaussian lowpass filter in the fre-
quency domain and the corresponding lowpass filter in the spatial domain.
Suppose that we want to use the shape of h(x) in Fig. 4.37(b) as a guide for
specifying the coefficients of a small spatial mask. The key similarity be-
tween the two filters is that all their values are positive. Thus, we conclude
that we can implement lowpass filtering in the spatial domain by using a
mask with all positive coefficients (as we did in Section 3.5.1). For reference,
Fig. 4.37(b) shows two of the masks discussed in that section. Note the recip-
rocal relationship between the width of the filters, as discussed in the previ-
ous paragraph. The narrower the frequency domain filter, the more it will attenuate all but the lowest frequencies, resulting in increased blurring. In the spatial
domain, this means that a larger mask must be used to increase blurring, as
illustrated in Example 3.13.
More complex filters can be constructed using the basic Gaussian function of Eq. (4.7-5). For example, we can construct a highpass filter as the difference of Gaussians:

H(u) = A e^{−u²/2σ1²} − B e^{−u²/2σ2²}   (4.7-7)

with A ≥ B and σ1 > σ2. The corresponding spatial filter is

h(x) = √(2π) σ1 A e^{−2π²σ1²x²} − √(2π) σ2 B e^{−2π²σ2²x²}   (4.7-8)
Figures 4.37(c) and (d) show plots of these two equations. We note again the
reciprocity in width, but the most important feature here is that h(x) has a pos-
itive center term with negative terms on either side. The small masks shown in
a c
b d
FIGURE 4.37 (a) A 1-D Gaussian lowpass filter in the frequency domain. (b) Spatial lowpass filter corresponding to (a), shown with two of the lowpass masks used in Chapter 3:

(1/9) ×  1 1 1        (1/16) ×  1 2 1
         1 1 1                  2 4 2
         1 1 1                  1 2 1

(c) Gaussian highpass filter in the frequency domain. (d) Spatial highpass filter corresponding to (c), shown with two of the sharpening masks used in Chapter 3:

 1  1  1              0  1  0
 1 −8  1              1 −4  1
 1  1  1              0  1  0
Fig. 4.37(d) “capture” this property. These two masks were used in Chapter 3
as sharpening filters, which we now know are highpass filters.
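The difference-of-Gaussians construction of Eq. (4.7-7) is a one-liner; in the sketch below (ours), the values of A, B, σ1, and σ2 are illustrative and satisfy A ≥ B and σ1 > σ2:

```python
import numpy as np

def dog_highpass(u, A=1.0, B=1.0, sigma1=4.0, sigma2=1.0):
    """1-D difference-of-Gaussians highpass filter of Eq. (4.7-7):
    H(u) = A exp(-u^2/2*sigma1^2) - B exp(-u^2/2*sigma2^2)."""
    return (A * np.exp(-u**2 / (2 * sigma1**2))
            - B * np.exp(-u**2 / (2 * sigma2**2)))
```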
Although we have gone through significant effort to get here, be assured
that it is impossible to truly understand filtering in the frequency domain
without the foundation we have just established. In practice, the frequency
domain can be viewed as a “laboratory” in which we take advantage of the
correspondence between frequency content and image appearance. As is
demonstrated numerous times later in this chapter, some tasks that would be
exceptionally difficult or impossible to formulate directly in the spatial do-
main become almost trivial in the frequency domain. Once we have selected a
specific filter via experimentation in the frequency domain, the actual imple-
mentation of the method usually is done in the spatial domain. One approach
is to specify small spatial masks that attempt to capture the “essence” of the
full filter function in the spatial domain, as we explained in Fig. 4.37. A more
formal approach is to design a 2-D digital filter by using approximations
based on mathematical or statistical criteria. We touch on this point again in
Section 4.11.4.
EXAMPLE 4.15: Obtaining a frequency domain filter from a small spatial mask.

■ In this example, we start with a spatial mask and show how to generate its corresponding filter in the frequency domain. Then, we compare the filtering results obtained using frequency domain and spatial techniques. This type of analysis is useful when one wishes to compare the performance of given spatial masks against one or more “full” filter candidates in the frequency domain, or to gain deeper understanding about the performance of a mask. To keep matters simple, we use the 3 * 3 Sobel vertical edge detector from Fig. 3.41(e). Figure 4.38(a) shows a 600 * 600 pixel image, f(x, y), that we wish to filter, and Fig. 4.38(b) shows its spectrum.
Figure 4.39(a) shows the Sobel mask, h(x, y) (the perspective plot is ex-
plained below). Because the input image is of size 600 * 600 pixels and the fil-
ter is of size 3 * 3 we avoid wraparound error by padding f and h to size
a b
FIGURE 4.38 (a) Image of a building, and (b) its spectrum.
a b
c d
FIGURE 4.39 (a) A spatial mask and perspective plot of its corresponding frequency domain filter. (b) Filter shown as an image. (c) Result of filtering Fig. 4.38(a) in the frequency domain with the filter in (b). (d) Result of filtering the same image with the spatial filter in (a). The results are identical. The mask in (a) is the Sobel vertical edge detector:

−1 0 1
−2 0 2
−1 0 1
602 * 602 pixels, according to Eqs. (4.6-29) and (4.6-30). The Sobel mask ex-
hibits odd symmetry, provided that it is embedded in an array of zeros of even
size (see Example 4.10). To maintain this symmetry, we place h(x, y) so that its
center is at the center of the 602 * 602 padded array. This is an important as-
pect of filter generation. If we preserve the odd symmetry with respect to the
padded array in forming hp(x, y), we know from property 9 in Table 4.1 that
H(u, v) will be purely imaginary. As we show at the end of this example, this
will yield results that are identical to filtering the image spatially using h(x, y).
If the symmetry were not preserved, the results would no longer be the same.
The procedure used to generate H(u, v) is: (1) multiply h_p(x, y) by (−1)^{x+y} to center the frequency domain filter; (2) compute the forward DFT of the result in (1); (3) set the real part of the resulting DFT to 0 to account for parasitic real parts (we know that H(u, v) has to be purely imaginary); and (4) multiply the result by (−1)^{u+v}. This last step reverses the multiplication of H(u, v) by (−1)^{u+v}, which is implicit when h(x, y) was moved to the center of h_p(x, y).
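Steps (1) through (4) map directly into code. The following sketch (ours, not from the original text) embeds a small odd-symmetric mask at the center of a P * Q array and returns the purely imaginary H(u, v):

```python
import numpy as np

def freq_filter_from_mask(h, P, Q):
    """Generate a centered H(u, v) of size P x Q from a small odd-symmetric
    spatial mask h, following steps (1)-(4) above (a sketch)."""
    hp = np.zeros((P, Q))
    m, n = h.shape
    r0, c0 = P // 2 - m // 2, Q // 2 - n // 2
    hp[r0:r0 + m, c0:c0 + n] = h          # embed h at the center of hp
    checker = (-1.0) ** np.add.outer(np.arange(P), np.arange(Q))
    H = np.fft.fft2(hp * checker)         # steps (1)-(2)
    H = 1j * H.imag                       # step (3): keep only the imaginary part
    return H * checker                    # step (4): multiply by (-1)^(u+v)
```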
Figure 4.39(a) shows a perspective plot of H(u, v), and Fig. 4.39(b) shows
H(u, v) as an image. As expected, the function is odd, hence the antisymmetry
about its center. Function H(u, v) is used as any other frequency domain filter
in the procedure outlined in Section 4.7.3.
Figure 4.39(c) is the result of using the filter just obtained in the proce-
dure outlined in Section 4.7.3 to filter the image in Fig. 4.38(a). As expected
from a derivative filter, edges are enhanced and all the constant intensity
areas are reduced to zero (the grayish tone is due to scaling for display).
Figure 4.39(d) shows the result of filtering the same image in the spatial do-
main directly, using h(x, y) in the procedure outlined in Section 3.6.4. The re-
sults are identical. ■
4.8 Image Smoothing Using Frequency Domain Filters

4.8.1 Ideal Lowpass Filters

A 2-D lowpass filter that passes without attenuation all frequencies within a circle of radius D0 from the origin, and “cuts off” all frequencies outside this circle, is called an ideal lowpass filter (ILPF); it is specified by the function

H(u, v) = { 1   if D(u, v) ≤ D0
          { 0   if D(u, v) > D0          (4.8-1)

where D0 is a positive constant and D(u, v) is the distance between a point (u, v) in the frequency domain and the center of the frequency rectangle; that is,

D(u, v) = [(u − P/2)² + (v − Q/2)²]^{1/2}   (4.8-2)

where, as before, P and Q are the padded sizes from Eqs. (4.6-31) and (4.6-32).
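Constructed on a discrete grid, the ILPF is simply a thresholded distance array (our sketch, not the book's code):

```python
import numpy as np

def ideal_lowpass(P, Q, D0):
    """ILPF of Eq. (4.8-1): 1 inside a circle of radius D0 centered on the
    P x Q frequency rectangle, 0 outside."""
    u = np.arange(P).reshape(-1, 1)
    v = np.arange(Q).reshape(1, -1)
    D = np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)   # Eq. (4.8-2)
    return (D <= D0).astype(float)
```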
Figure 4.40(a) shows a perspective plot of H(u, v) and Fig. 4.40(b) shows the
filter displayed as an image. As mentioned in Section 4.3.3, the name ideal
indicates that all frequencies on or inside a circle of radius D0 are passed
a b c
FIGURE 4.40 (a) Perspective plot of an ideal lowpass-filter transfer function. (b) Filter displayed as an image. (c) Filter radial cross section.
without attenuation, whereas all frequencies outside the circle are completely
attenuated (filtered out). The ideal lowpass filter is radially symmetric about
the origin, which means that the filter is completely defined by a radial cross
section, as Fig. 4.40(c) shows. Rotating the cross section by 360° yields the fil-
ter in 2-D.
For an ILPF cross section, the point of transition between H(u, v) = 1 and
H(u, v) = 0 is called the cutoff frequency. In the case of Fig. 4.40, for example,
the cutoff frequency is D0. The sharp cutoff frequencies of an ILPF cannot be
realized with electronic components, although they certainly can be simulated
in a computer. The effects of using these “nonphysical” filters on a digital
image are discussed later in this section.
The lowpass filters introduced in this chapter are compared by studying
their behavior as a function of the same cutoff frequencies. One way to estab-
lish a set of standard cutoff frequency loci is to compute circles that enclose
specified amounts of total image power P_T. This quantity is obtained by summing the components of the power spectrum of the padded images at each point (u, v), for u = 0, 1, …, P − 1 and v = 0, 1, …, Q − 1; that is,

P_T = Σ_{u=0}^{P−1} Σ_{v=0}^{Q−1} P(u, v)   (4.8-3)

where P(u, v) is given in Eq. (4.6-18). If the DFT has been centered, a circle of radius D0 with origin at the center of the frequency rectangle encloses α percent of the power, where

α = 100 [ Σ_u Σ_v P(u, v) / P_T ]   (4.8-4)

and the summation is taken over values of (u, v) that lie inside the circle or on its boundary.
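Equations (4.8-3) and (4.8-4) translate into a few lines of code (our sketch; it assumes the DFT has been centered):

```python
import numpy as np

def enclosed_power_percent(F_centered, D0):
    """Percent of total image power enclosed by a circle of radius D0 about
    the center of a centered DFT, per Eqs. (4.8-3) and (4.8-4)."""
    P, Q = F_centered.shape
    power = np.abs(F_centered) ** 2                    # P(u, v), Eq. (4.6-18)
    u = np.arange(P).reshape(-1, 1)
    v = np.arange(Q).reshape(1, -1)
    inside = (u - P / 2) ** 2 + (v - Q / 2) ** 2 <= D0 ** 2
    return 100.0 * power[inside].sum() / power.sum()   # Eq. (4.8-4)
```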
Figures 4.41(a) and (b) show a test pattern image and its spectrum. The circles superimposed on the spectrum have radii of 10, 30, 60, 160, and 460 pixels, respectively. These circles enclose α percent of the image power, for α = 87.0, 93.1, 95.7, 97.8, and 99.2%, respectively. The spectrum falls off rapidly, with 87% of the total power being enclosed by a relatively small circle of radius 10.
EXAMPLE 4.16: Image smoothing using an ILPF.

■ Figure 4.42 shows the results of applying ILPFs with cutoff frequencies at the radii shown in Fig. 4.41(b). Figure 4.42(b) is useless for all practical purposes, unless the objective of blurring is to eliminate all detail in the image,
except the “blobs” representing the largest objects. The severe blurring in
this image is a clear indication that most of the sharp detail information in
the picture is contained in the 13% power removed by the filter. As the filter
radius increases, less and less power is removed, resulting in less blurring.
Note that the images in Figs. 4.42(c) through (e) are characterized by “ring-
ing,” which becomes finer in texture as the amount of high frequency con-
tent removed decreases. Ringing is visible even in the image [Fig. 4.42(e)] in
which only 2% of the total power was removed. This ringing behavior is a
characteristic of ideal filters, as you will see shortly. Finally, the result for α = 99.2 shows very slight blurring in the noisy squares but, for the most
part, this image is quite close to the original. This indicates that little edge
information is contained in the upper 0.8% of the spectrum power in this
particular case.
It is clear from this example that ideal lowpass filtering is not very practical. However, it is useful to study the behavior of ILPFs as part of our development of
a b
FIGURE 4.41 (a) Test pattern of size 688 * 688 pixels, and (b) its Fourier spectrum. The spectrum is double the image size due to padding but is shown in half size so that it fits in the page. The superimposed circles have radii equal to 10, 30, 60, 160, and 460 with respect to the full-size spectrum image. These radii enclose 87.0, 93.1, 95.7, 97.8, and 99.2% of the padded image power, respectively.
a b
c d
e f
FIGURE 4.42 (a) Original image. (b)–(f) Results of filtering using ILPFs with cutoff frequencies set at radii values 10, 30, 60, 160, and 460, as shown in Fig. 4.41(b). The power removed by these filters was 13, 6.9, 4.3, 2.2, and 0.8% of the total, respectively.
filtering concepts. Also, as shown in the discussion that follows, some interest-
ing insight is gained by attempting to explain the ringing property of ILPFs in
the spatial domain. ■
The blurring and ringing properties of ILPFs can be explained using the
convolution theorem. Figure 4.43(a) shows the spatial representation, h(x, y), of
an ILPF of radius 10, and Fig. 4.43(b) shows the intensity profile of a line passing
through the center of the image. Because a cross section of the ILPF in the fre-
quency domain looks like a box filter, it is not unexpected that a cross section of
the corresponding spatial filter has the shape of a sinc function. Filtering in the
spatial domain is done by convolving h(x, y) with the image. Imagine each pixel
in the image being a discrete impulse whose strength is proportional to the in-
tensity of the image at that location. Convolving a sinc with an impulse copies
the sinc at the location of the impulse. The center lobe of the sinc is the principal
cause of blurring, while the outer, smaller lobes are mainly responsible for ring-
ing. Convolving the sinc with every pixel in the image provides a nice model for
explaining the behavior of ILPFs. Because the “spread” of the sinc function is in-
versely proportional to the radius of H(u, v), the larger D0 becomes, the more
the spatial sinc approaches an impulse which, in the limit, causes no blurring at
all when convolved with the image. This type of reciprocal behavior should be
routine to you by now. In the next two sections, we show that it is possible to
achieve blurring with little or no ringing, which is an important objective in
lowpass filtering.
a b
FIGURE 4.43 (a) Representation in the spatial domain of an ILPF of radius 5 and size 1000 * 1000. (b) Intensity profile of a horizontal line passing through the center of the image.
4.8.2 Butterworth Lowpass Filters

The transfer function of a Butterworth lowpass filter (BLPF) of order n, and with cutoff frequency at a distance D0 from the origin, is defined as

H(u, v) = 1 / (1 + [D(u, v)/D0]^{2n})   (4.8-5)

where D(u, v) is given by Eq. (4.8-2).

a b c
FIGURE 4.44 (a) Perspective plot of a Butterworth lowpass-filter transfer function. (b) Filter displayed as an image. (c) Filter radial cross sections of orders 1 through 4.
Unlike the ILPF, the BLPF transfer function does not have a sharp discontinuity that gives a clear cutoff between passed and filtered frequencies. For filters with smooth transfer functions, defining a cutoff frequency locus at points for which H(u, v) is down to a certain fraction of its maximum value is customary. In Eq. (4.8-5), H(u, v) = 0.5 (down 50% from its maximum value of 1) when D(u, v) = D0.
EXAMPLE 4.17: Image smoothing with a Butterworth lowpass filter.

■ Figure 4.45 shows the results of applying the BLPF of Eq. (4.8-5) to Fig. 4.45(a), with n = 2 and D0 equal to the five radii in Fig. 4.41(b). Unlike the results in Fig. 4.42 for the ILPF, we note here a smooth transition in blurring as a function of increasing cutoff frequency. Moreover, no ringing is visible in any of the images processed with this particular BLPF, a fact attributed to the filter's smooth transition between low and high frequencies. ■
a b
c d
e f
FIGURE 4.45 (a) Original image. (b)–(f) Results of filtering using BLPFs of order 2, with cutoff frequencies at the radii shown in Fig. 4.41. Compare with Fig. 4.42.
a b c d
FIGURE 4.46 (a)–(d) Spatial representation of BLPFs of order 1, 2, 5, and 20, and corresponding intensity
profiles through the center of the filters (the size in all cases is 1000 * 1000 and the cutoff frequency is 5).
Observe how ringing increases as a function of filter order.
where, as in Eq. (4.8-2), D(u, v) is the distance from the center of the frequency
rectangle. Here we do not use a multiplying constant as in Section 4.7.4 in
order to be consistent with the filters discussed in the present section, whose
highest value is 1. As before, s is a measure of spread about the center. By let-
ting s = D0, we can express the filter using the notation of the other filters in
this section:
H(u, v) = e^{-D²(u, v)/2D0²}    (4.8-7)
where D0 is the cutoff frequency. When D(u, v) = D0, the GLPF is down to
0.607 of its maximum value.
As Table 4.3 shows, the inverse Fourier transform of the GLPF is Gaussian
also. This means that a spatial Gaussian filter, obtained by computing the
IDFT of Eq. (4.8-6) or (4.8-7), will have no ringing. Figure 4.47 shows a per-
spective plot, image display, and radial cross sections of a GLPF function, and
Table 4.4 summarizes the lowpass filters discussed in this section.
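A corresponding one-line sketch of the GLPF of Eq. (4.8-7), with the same conventions (and the same distance array D) as the previous snippets:

```python
import numpy as np

def gaussian_lowpass(D, D0):
    """GLPF, Eq. (4.8-7); down to exp(-1/2) = 0.607 of its max at D == D0."""
    return np.exp(-(D**2) / (2.0 * D0**2))
```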
a b c
FIGURE 4.47 (a) Perspective plot of a GLPF transfer function. (b) Filter displayed as an image. (c) Filter radial cross sections for values of D0 equal to 10, 20, 40, and 100.
TABLE 4.4
Lowpass filters. D0 is the cutoff frequency and n is the order of the Butterworth filter.

Ideal:         H(u, v) = 1 if D(u, v) ≤ D0;  0 if D(u, v) > D0
Butterworth:   H(u, v) = 1/(1 + [D(u, v)/D0]^{2n})
Gaussian:      H(u, v) = e^{-D²(u, v)/2D0²}
EXAMPLE 4.18: Image smoothing with a Gaussian lowpass filter.
■ Figure 4.48 shows the results of applying the GLPF of Eq. (4.8-7) to Fig. 4.48(a), with D0 equal to the five radii in Fig. 4.41(b). As in the case of the BLPF of order 2 (Fig. 4.45), we note a smooth transition in blurring as a function of increasing cutoff frequency. The GLPF achieved slightly less smoothing than the BLPF of order 2 for the same value of cutoff frequency, as can be seen, for example, by comparing Figs. 4.45(c) and 4.48(c). This is expected, because the profile of the GLPF is not as "tight" as the profile of the BLPF of order 2. However, the results are quite comparable, and we are assured of no ringing in the case of the GLPF. This is an important characteristic in practice, especially in situations (e.g., medical imaging) in which any type of artifact is unacceptable. In cases where tight control of the transition between low and high frequencies about the cutoff frequency is needed, the BLPF presents a more suitable choice. The price of this additional control over the filter profile is the possibility of ringing. ■
a b
c d
e f
FIGURE 4.48 (a) Original image. (b)–(f) Results of filtering using GLPFs with cutoff
frequencies at the radii shown in Fig. 4.41. Compare with Figs. 4.42 and 4.45.
a b
FIGURE 4.49
(a) Sample text of
low resolution
(note broken
characters in
magnified view).
(b) Result of
filtering with a
GLPF (broken
character
segments were
joined).
satellite and aerial images. Similar results can be obtained using the lowpass
spatial filtering techniques discussed in Section 3.5.
Figure 4.49 shows a sample of text of poor resolution. One encounters text
like this, for example, in fax transmissions, duplicated material, and historical
records. This particular sample is free of additional difficulties like smudges,
creases, and torn sections. The magnified section in Fig. 4.49(a) shows that the
characters in this document have distorted shapes due to lack of resolution,
and many of the characters are broken. Although humans fill these gaps visu-
ally without difficulty, machine recognition systems have real difficulties read-
ing broken characters. One approach for handling this problem is to bridge
small gaps in the input image by blurring it. Figure 4.49(b) shows how well
characters can be “repaired” by this simple process using a Gaussian lowpass
filter with D0 = 80. The images are of size 444 * 508 pixels.
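A minimal sketch of how such a repair might be carried out, following the filtering procedure of Section 4.7.3 (padding, centered DFT, filtering, cropping). Here `img` is assumed to be a 2-D float array and the function name is ours:

```python
import numpy as np

def glpf_smooth(img, D0=80.0):
    """Lowpass filter an image with a GLPF in the frequency domain."""
    M, N = img.shape
    P, Q = 2 * M, 2 * N                       # pad to avoid wraparound error
    F = np.fft.fftshift(np.fft.fft2(img, s=(P, Q)))
    u = np.arange(P) - P // 2
    v = np.arange(Q) - Q // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    H = np.exp(-(U**2 + V**2) / (2.0 * D0**2))   # GLPF, Eq. (4.8-7)
    g = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
    return g[:M, :N]                          # crop back to the original size
```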
Lowpass filtering is a staple in the printing and publishing industry, where it is used for numerous preprocessing functions, including unsharp masking, as discussed in Section 3.6.3. (We discuss unsharp masking in the frequency domain in Section 4.9.5.) "Cosmetic" processing is another use of lowpass filtering prior to printing. Figure 4.50 shows an application of lowpass filtering for producing a smoother, softer-looking result from a sharp original. For human faces, the typical objective is to reduce the sharpness of fine skin lines and small blemishes. The magnified sections in Figs. 4.50(b) and (c) clearly show a significant reduction in fine skin lines around the eyes in this case. In fact, the smoothed images look quite soft and pleasing.
Figure 4.51 shows two applications of lowpass filtering on the same image,
but with totally different objectives. Figure 4.51(a) is an 808 * 754 very high
resolution radiometer (VHRR) image showing part of the Gulf of Mexico
(dark) and Florida (light), taken from a NOAA satellite (note the horizontal
sensor scan lines). The boundaries between bodies of water were caused by
loop currents. This image is illustrative of remotely sensed images in which sen-
sors have the tendency to produce pronounced scan lines along the direction in
which the scene is being scanned (see Example 4.24 for an illustration of a
a b c
FIGURE 4.50 (a) Original image (784 * 732 pixels). (b) Result of filtering using a GLPF with D0 = 100.
(c) Result of filtering using a GLPF with D0 = 80. Note the reduction in fine skin lines in the magnified
sections in (b) and (c).
physical cause). Lowpass filtering is a crude but simple way to reduce the effect
of these lines, as Fig. 4.51(b) shows (we consider more effective approaches in
Sections 4.10 and 5.4.1). This image was obtained using a GLPF with D0 = 50.
The reduction in the effect of the scan lines can simplify the detection of fea-
tures such as the interface boundaries between ocean currents.
Figure 4.51(c) shows the result of significantly more aggressive Gaussian
lowpass filtering with D0 = 20. Here, the objective is to blur out as much de-
tail as possible while leaving large features recognizable. For instance, this type
of filtering could be part of a preprocessing stage for an image analysis system
that searches for features in an image bank. An example of such features could
be lakes of a given size, such as Lake Okeechobee in the lower eastern region
of Florida, shown as a nearly round dark region in Fig. 4.51(c). Lowpass filter-
ing helps simplify the analysis by averaging out features smaller than the ones
of interest.
a b c
FIGURE 4.51 (a) Image showing prominent horizontal scan lines. (b) Result of filtering using a GLPF with
D0 = 50. (c) Result of using a GLPF with D0 = 20. (Original image courtesy of NOAA.)
As in Section 4.8, we consider only zero-phase-shift filters that are radially symmetric. All
filtering in this section is based on the procedure outlined in Section 4.7.3, so
all filter functions, H(u, v), are understood to be discrete functions of size
P * Q; that is, the discrete frequency variables are in the range
u = 0, 1, 2, Á , P - 1 and v = 0, 1, 2, Á , Q - 1.
A highpass filter is obtained from a given lowpass filter using the equation

HHP(u, v) = 1 - HLP(u, v)    (4.9-1)

where HLP(u, v) is the transfer function of the lowpass filter. That is, when the lowpass filter attenuates frequencies, the highpass filter passes them, and vice versa.
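In code, Eq. (4.9-1) is a one-liner applied to any of the lowpass transfer functions sketched in Section 4.8 (the helper names are ours):

```python
def highpass_from_lowpass(H_lp):
    """Eq. (4.9-1): turn a lowpass transfer function into a highpass one."""
    return 1.0 - H_lp

# Example: an IHPF (Eq. 4.9-2) from an ILPF of radius D0:
# H_ihpf = highpass_from_lowpass((D <= D0).astype(float))
```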
In this section, we consider ideal, Butterworth, and Gaussian highpass fil-
ters. As in the previous section, we illustrate the characteristics of these filters
in both the frequency and spatial domains. Figure 4.52 shows typical 3-D plots,
image representations, and cross sections for these filters. As before, we see
that the Butterworth filter represents a transition between the sharpness of
the ideal filter and the broad smoothness of the Gaussian filter. Figure 4.53,
discussed in the sections that follow, illustrates what these filters look like in
the spatial domain. The spatial filters were obtained and displayed by using the
procedure used to generate Figs. 4.43 and 4.46.
H(u, v) = 0 if D(u, v) ≤ D0;  1 if D(u, v) > D0    (4.9-2)
a b c
d e f
g h i
FIGURE 4.52 Top row: Perspective plot, image representation, and cross section of a typical ideal highpass filter. Middle and bottom rows: The same sequence for typical Butterworth and Gaussian highpass filters.
where D0 is the cutoff frequency and D(u, v) is given by Eq. (4.8-2). This ex-
pression follows directly from Eqs. (4.8-1) and (4.9-1). As intended, the IHPF
is the opposite of the ILPF in the sense that it sets to zero all frequencies inside
a circle of radius D0 while passing, without attenuation, all frequencies outside
the circle. As in the case of the ILPF, the IHPF is not physically realizable. How-
ever, we consider it here for completeness and, as before, because its proper-
ties can be used to explain phenomena such as ringing in the spatial domain.
The discussion will be brief.
Because of the way in which they are related [Eq. (4.9-1)], we can expect
IHPFs to have the same ringing properties as ILPFs. This is demonstrated
a b c
FIGURE 4.53 Spatial representation of typical (a) ideal, (b) Butterworth, and (c) Gaussian frequency domain highpass filters, and corresponding intensity profiles through their centers.
clearly in Fig. 4.54, which consists of various IHPF results using the original
image in Fig. 4.41(a) with D0 set to 30, 60, and 160 pixels, respectively. The ring-
ing in Fig. 4.54(a) is so severe that it produced distorted, thickened object
boundaries (e.g., look at the large letter “a”). Edges of the top three circles do
not show well because they are not as strong as the other edges in the image
(the intensity of these three objects is much closer to the background intensity,
a b c
FIGURE 4.54 Results of highpass filtering the image in Fig. 4.41(a) using an IHPF with D0 = 30, 60, and 160.
H(u, v) = 1/(1 + [D0/D(u, v)]^{2n})    (4.9-3)
where D(u, v) is given by Eq. (4.8-2). This expression follows directly from
Eqs. (4.8-5) and (4.9-1). The middle row of Fig. 4.52 shows an image and cross
section of the BHPF function.
As with lowpass filters, we can expect Butterworth highpass filters to behave more smoothly than IHPFs. Figure 4.55 shows the performance of a BHPF, of
a b c
FIGURE 4.55 Results of highpass filtering the image in Fig. 4.41(a) using a BHPF of order 2 with D0 = 30, 60,
and 160, corresponding to the circles in Fig. 4.41(b). These results are much smoother than those obtained
with an IHPF.
a b c
FIGURE 4.56 Results of highpass filtering the image in Fig. 4.41(a) using a GHPF with D0 = 30, 60, and 160,
corresponding to the circles in Fig. 4.41(b). Compare with Figs. 4.54 and 4.55.
order 2 and with D0 set to the same values as in Fig. 4.54. The boundaries are
much less distorted than in Fig. 4.54, even for the smallest value of cutoff fre-
quency. Because the spot sizes in the center areas of the IHPF and the BHPF
are similar [see Figs. 4.53(a) and (b)], the performance of the two filters on the
smaller objects is comparable. The transition into higher values of cutoff fre-
quencies is much smoother with the BHPF.
TABLE 4.5
Highpass filters. D0 is the cutoff frequency and n is the order of the Butterworth filter.

Ideal:         H(u, v) = 0 if D(u, v) ≤ D0;  1 if D(u, v) > D0
Butterworth:   H(u, v) = 1/(1 + [D0/D(u, v)]^{2n})
Gaussian:      H(u, v) = 1 - e^{-D²(u, v)/2D0²}
EXAMPLE 4.19: Using highpass filtering and thresholding for image enhancement.
■ Figure 4.57(a) is a 1026 * 962 image of a thumb print in which smudges (a typical problem) are evident. A key step in automated fingerprint recognition is enhancement of print ridges and the reduction of smudges. Enhancement is useful also in human interpretation of prints. In this example, we use highpass filtering to enhance the ridges and reduce the effects of smudging. Enhancement of the ridges is accomplished by the fact that they contain high frequencies, which are unchanged by a highpass filter. On the other hand, the filter reduces low frequency components, which correspond to slowly varying intensities in the image, such as the background and smudges. Thus, enhancement is achieved by reducing the effect of all features except those with high frequencies, which are the features of interest in this case.
Figure 4.57(b) is the result of using a Butterworth highpass filter of order 4 with a cutoff frequency of 50. (The value D0 = 50 is approximately 2.5% of the short dimension of the padded image. The idea is for D0 to be close to the origin so low frequencies are attenuated, but not completely eliminated. A range of 2% to 5% of the short dimension is a good starting point.) As expected, the highpass-filtered image lost its gray tones because the dc term was reduced to 0. The net result is that dark tones typically predominate in highpass-filtered images, thus requiring additional processing to enhance details of interest. A simple approach is to threshold the filtered image. Figure 4.57(c) shows the result of setting to black all negative values and to white all positive values in the filtered image. Note how the ridges are clear and the effect of the smudges has been reduced considerably. In fact, ridges that are barely visible in the top, right section of the image in Fig. 4.57(a) are nicely enhanced in Fig. 4.57(c). ■
a b c
FIGURE 4.57 (a) Thumb print. (b) Result of highpass filtering (a). (c) Result of
thresholding (b). (Original image courtesy of the U.S. National Institute of Standards
and Technology.)
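A minimal sketch of the two steps of this example (a BHPF of order 4 with D0 = 50, followed by thresholding at zero); conventions follow the earlier snippets and the function name is ours:

```python
import numpy as np

def bhpf_threshold(img, D0=50.0, n=4):
    """Highpass filter with a BHPF (Eq. 4.9-3), then threshold at 0."""
    M, N = img.shape
    P, Q = 2 * M, 2 * N
    F = np.fft.fftshift(np.fft.fft2(img, s=(P, Q)))
    u = np.arange(P) - P // 2
    v = np.arange(Q) - Q // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    D = np.sqrt(U**2 + V**2)
    D[D == 0] = 1e-8                           # avoid division by zero at dc
    H = 1.0 / (1.0 + (D0 / D) ** (2 * n))      # BHPF, Eq. (4.9-3)
    g = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))[:M, :N]
    return np.where(g > 0, 1.0, 0.0)           # white positive, black negative
```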
or, with respect to the center of the frequency rectangle, using the filter

H(u, v) = -4π²[(u - P/2)² + (v - Q/2)²] = -4π²D²(u, v)    (4.9-6)

where D(u, v) is the distance function given in Eq. (4.8-2). Then, the Laplacian image is obtained as:

∇²f(x, y) = ℑ⁻¹{H(u, v)F(u, v)}    (4.9-7)

where F(u, v) is the DFT of f(x, y). As explained in Section 3.6.2, enhancement is achieved using the equation:

g(x, y) = f(x, y) + c∇²f(x, y)    (4.9-8)

with c = -1 because H(u, v) is negative. Equivalently, Eqs. (4.9-7) and (4.9-8) can be combined into the single frequency domain expression

g(x, y) = ℑ⁻¹{[1 + 4π²D²(u, v)]F(u, v)}
Although this result is elegant, it has the same scaling issues just mentioned,
compounded by the fact that the normalizing factor is not as easily computed.
For this reason, Eq. (4.9-8) is the preferred implementation in the frequency
domain, with §2f(x, y) computed using Eq. (4.9-7) and scaled using the ap-
proach mentioned in the previous paragraph.
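A minimal sketch of this preferred implementation, i.e., Eqs. (4.9-7) and (4.9-8) with the scaling just described; padding is omitted for brevity, and the normalization details are assumptions:

```python
import numpy as np

def laplacian_sharpen(img):
    """Sharpen via the frequency domain Laplacian: g = f - (scaled Laplacian)."""
    M, N = img.shape
    f = img / img.max()                         # scale f to [0, 1]
    F = np.fft.fftshift(np.fft.fft2(f))
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    H = -4.0 * np.pi**2 * (U**2 + V**2)         # Laplacian filter, Eq. (4.9-6)
    lap = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
    lap = lap / np.abs(lap).max()               # scale Laplacian to [-1, 1]
    return f - lap                              # Eq. (4.9-8) with c = -1
    # (Clipping or rescaling the result for display is omitted here.)
```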
EXAMPLE 4.20: Image sharpening in the frequency domain using the Laplacian.
■ Figure 4.58(a) is the same as Fig. 3.38(a), and Fig. 4.58(b) shows the result of using Eq. (4.9-8), in which the Laplacian was computed in the frequency domain using Eq. (4.9-7). Scaling was done as described in connection with that equation. We see by comparing Figs. 4.58(b) and 3.38(e) that the frequency domain and spatial results are visually identical. Observe that the results in these two figures correspond to the Laplacian mask in Fig. 3.37(b), which has a -8 in the center (Problem 4.26). ■
a b
FIGURE 4.58
(a) Original,
blurry image.
(b) Image
enhanced using
the Laplacian in
the frequency
domain. Compare
with Fig. 3.38(e).
with

fLP(x, y) = ℑ⁻¹[HLP(u, v)F(u, v)]

where HLP(u, v) is a lowpass filter and F(u, v) is the Fourier transform of f(x, y). Here, fLP(x, y) is a smoothed image analogous to f̄(x, y) in Eq. (3.6-8). Then, as in Eq. (3.6-9),

g(x, y) = f(x, y) + k[f(x, y) - fLP(x, y)]

Using Eq. (4.9-1), we can express this result in terms of a highpass filter. The generalization, known as high-frequency-emphasis filtering, is

g(x, y) = ℑ⁻¹{[k1 + k2 HHP(u, v)]F(u, v)}    (4.9-15)

where k1 ≥ 0 controls the offset from the origin [see Fig. 4.31(c)] and k2 ≥ 0 controls the contribution of high frequencies.
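A minimal sketch of Eq. (4.9-15) with a Gaussian highpass filter; the parameter defaults mirror Example 4.21 below and the function name is ours:

```python
import numpy as np

def high_freq_emphasis(img, D0=40.0, k1=0.5, k2=0.75):
    """High-frequency-emphasis filtering, Eq. (4.9-15), with a GHPF."""
    M, N = img.shape
    P, Q = 2 * M, 2 * N
    F = np.fft.fftshift(np.fft.fft2(img, s=(P, Q)))
    u = np.arange(P) - P // 2
    v = np.arange(Q) - Q // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    H_hp = 1.0 - np.exp(-(U**2 + V**2) / (2.0 * D0**2))  # GHPF via Eq. (4.9-1)
    G = (k1 + k2 * H_hp) * F                   # dc offset + high-frequency boost
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))[:M, :N]
```

Because k1 > 0 keeps the dc term from being zeroed out, the gray-level tonality of the input survives; histogram equalization of the result, as in Fig. 4.59(d), would be a separate spatial domain step.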
EXAMPLE 4.21: Image enhancement using high-frequency-emphasis filtering.
■ Figure 4.59(a) shows a 416 * 596 chest X-ray with a narrow range of intensity levels. The objective of this example is to enhance the image using high-frequency-emphasis filtering. X-rays cannot be focused in the same manner that optical lenses are focused, and the resulting images generally tend to be slightly blurred. Because the intensities in this particular image are biased toward the dark end of the gray scale, we also take this opportunity to give an example of how spatial domain processing can be used to complement frequency-domain filtering.
Figure 4.59(b) shows the result of highpass filtering using a Gaussian filter with D0 = 40 (approximately 5% of the short dimension of the padded image). As expected, the filtered result is rather featureless, but it shows faintly the principal edges in the image. Figure 4.59(c) shows the advantage of high-emphasis filtering, where we used Eq. (4.9-15) with k1 = 0.5 and k2 = 0.75. Although the image is still dark, the gray-level tonality due to the low-frequency components was not lost.
(Artifacts such as ringing are unacceptable in medical imaging. Thus, it is good practice to avoid using filters that have the potential for introducing artifacts in the processed image. Because spatial and frequency domain Gaussian filters are Fourier transform pairs, these filters produce smooth results that are void of artifacts.)
As discussed in Section 3.3.1, an image characterized by intensity levels in a narrow range of the gray scale is an ideal candidate for histogram equalization. As Fig. 4.59(d) shows, this was indeed an appropriate method to further enhance the image. Note the clarity of the bone structure and other details that simply are not visible in any of the other three images. The final enhanced image is a little noisy, but this is typical of X-ray images when their gray scale is expanded. The result obtained using a combination of high-frequency emphasis and histogram equalization is superior to the result that would be obtained by using either method alone. ■
a b
c d
FIGURE 4.59 (a) A chest X-ray image. (b) Result of highpass filtering with a Gaussian
filter. (c) Result of high-frequency-emphasis filtering using the same filter. (d) Result of
performing histogram equalization on (c). (Original image courtesy of Dr. Thomas R.
Gest, Division of Anatomical Sciences, University of Michigan Medical School.)
ℑ{z(x, y)} = ℑ{ln f(x, y)}
           = ℑ{ln i(x, y)} + ℑ{ln r(x, y)}    (4.9-19)

or

Z(u, v) = Fi(u, v) + Fr(u, v)    (4.9-20)

where Fi(u, v) and Fr(u, v) are the Fourier transforms of ln i(x, y) and ln r(x, y), respectively.
We can filter Z(u, v) using a filter H(u, v) so that

S(u, v) = H(u, v)Z(u, v) = H(u, v)Fi(u, v) + H(u, v)Fr(u, v)    (4.9-21)

The filtered image in the spatial domain is then

s(x, y) = ℑ⁻¹{S(u, v)} = ℑ⁻¹{H(u, v)Fi(u, v)} + ℑ⁻¹{H(u, v)Fr(u, v)}    (4.9-22)

By defining

i′(x, y) = ℑ⁻¹{H(u, v)Fi(u, v)}    (4.9-23)

and

r′(x, y) = ℑ⁻¹{H(u, v)Fr(u, v)}    (4.9-24)

we can express Eq. (4.9-22) in the form

s(x, y) = i′(x, y) + r′(x, y)    (4.9-25)

Finally, because z(x, y) was formed by taking the natural logarithm of the input image, we reverse the process by taking the exponential of the filtered result to form the output image:

g(x, y) = e^{s(x, y)} = e^{i′(x, y)} e^{r′(x, y)} = i0(x, y) r0(x, y)    (4.9-26)

where

i0(x, y) = e^{i′(x, y)}    (4.9-27)

and

r0(x, y) = e^{r′(x, y)}    (4.9-28)

are the illumination and reflectance components of the output (filtered) image.
FIGURE 4.60 Summary of steps in homomorphic filtering:
f(x, y) → ln → DFT → H(u, v) → (DFT)⁻¹ → exp → g(x, y)
The filtering approach just derived is summarized in Fig. 4.60. This method
is based on a special case of a class of systems known as homomorphic systems.
In this particular application, the key to the approach is the separation of the
illumination and reflectance components achieved in the form shown in
Eq. (4.9-20). The homomorphic filter function H(u, v) then can operate on
these components separately, as indicated by Eq. (4.9-21).
The illumination component of an image generally is characterized by slow
spatial variations, while the reflectance component tends to vary abruptly, par-
ticularly at the junctions of dissimilar objects. These characteristics lead to as-
sociating the low frequencies of the Fourier transform of the logarithm of an
image with illumination and the high frequencies with reflectance. Although
these associations are rough approximations, they can be used to advantage in
image filtering, as illustrated in Example 4.22.
A good deal of control can be gained over the illumination and reflectance
components with a homomorphic filter. This control requires specification of
a filter function H(u, v) that affects the low- and high-frequency components
of the Fourier transform in different, controllable ways. Figure 4.61 shows a
cross section of such a filter. If the parameters γL and γH are chosen so that γL < 1 and γH > 1, the filter function in Fig. 4.61 tends to attenuate the contribution made by the low frequencies (illumination) and amplify the contri-
bution made by high frequencies (reflectance). The net result is simultaneous
dynamic range compression and contrast enhancement.
The shape of the function in Fig. 4.61 can be approximated using the basic form of a highpass filter. For example, using a slightly modified form of the Gaussian highpass filter yields the function

H(u, v) = (γH - γL)[1 - e^{-c D²(u, v)/D0²}] + γL    (4.9-29)
where D(u, v) is defined in Eq. (4.8-2) and the constant c controls the sharpness of the slope of the function as it transitions between γL and γH. This filter is similar to the high-frequency-emphasis filter discussed in the previous section.
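A minimal sketch of the complete pipeline of Fig. 4.60 using the filter of Eq. (4.9-29); the ln(1 + f) and exp(·) - 1 pairing is our own implementation detail, added to keep the logarithm defined at zero-valued pixels:

```python
import numpy as np

def homomorphic(img, gamma_L=0.25, gamma_H=2.0, c=1.0, D0=80.0):
    """Homomorphic filtering: ln -> DFT -> H(u, v) -> IDFT -> exp."""
    M, N = img.shape
    z = np.log1p(img.astype(float))             # z = ln(f + 1)
    P, Q = 2 * M, 2 * N                         # filter is of size P x Q (padded)
    Z = np.fft.fftshift(np.fft.fft2(z, s=(P, Q)))
    u = np.arange(P) - P // 2
    v = np.arange(Q) - Q // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    D2 = U**2 + V**2
    H = (gamma_H - gamma_L) * (1.0 - np.exp(-c * D2 / D0**2)) + gamma_L  # Eq. (4.9-29)
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))[:M, :N]
    return np.expm1(s)                          # g = exp(s) - 1
```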
EXAMPLE 4.22: Image enhancement using homomorphic filtering.
■ Figure 4.62(a) shows a full body PET (Positron Emission Tomography) scan of size 1162 * 746 pixels. The image is slightly blurry and many of its low-intensity features are obscured by the high intensity of the "hot spots" dominating the dynamic range of the display. (These hot spots were caused by a tumor in the brain and one in the lungs.) Figure 4.62(b) was obtained by homomorphic filtering Fig. 4.62(a) using the filter in Eq. (4.9-29) with γL = 0.25, γH = 2, c = 1, and D0 = 80. A cross section of this filter looks just like Fig. 4.61, with a slightly steeper slope. (Recall that filtering uses image padding, so the filter is of size P * Q.)
Note in Fig. 4.62(b) how much sharper the hot spots, the brain, and the skeleton are in the processed image, and how much more detail is visible in this image. By reducing the effects of the dominant illumination components (the hot spots), it became possible for the dynamic range of the display to allow lower intensities to become much more visible. Similarly, because the high frequencies are enhanced by homomorphic filtering, the reflectance components of the image (edge information) were sharpened considerably. The enhanced image in Fig. 4.62(b) is a significant improvement over the original. ■
a b
FIGURE 4.62
(a) Full body PET
scan. (b) Image
enhanced using
homomorphic
filtering. (Original
image courtesy of
Dr. Michael
E. Casey, CTI
PET Systems.)
where Hk(u, v) and H-k(u, v) are highpass filters whose centers are at (uk, vk) and (-uk, -vk), respectively. These centers are specified with respect to the

TABLE 4.6
Bandreject filters. W is the width of the band, D is the distance D(u, v) from the center of the filter, D0 is the cutoff frequency, and n is the order of the Butterworth filter. We show D instead of D(u, v) to simplify the notation in the table.

Ideal:         H(u, v) = 0 if D0 - W/2 ≤ D ≤ D0 + W/2;  1 otherwise
Butterworth:   H(u, v) = 1/(1 + [DW/(D² - D0²)]^{2n})
Gaussian:      H(u, v) = 1 - e^{-[(D² - D0²)/DW]²}
a b
FIGURE 4.63
(a) Bandreject
Gaussian filter.
(b) Corresponding
bandpass filter.
The thin black
border in (a) was
added for clarity; it
is not part of the
data.
center of the frequency rectangle, (M/2, N/2). The distance computations for each filter are thus carried out using the expressions

Dk(u, v) = [(u - M/2 - uk)² + (v - N/2 - vk)²]^{1/2}    (4.10-3)

and

D-k(u, v) = [(u - M/2 + uk)² + (v - N/2 + vk)²]^{1/2}    (4.10-4)
For example, the following is a Butterworth notch reject filter of order n, containing three notch pairs:

HNR(u, v) = ∏_{k=1}^{3} [1/(1 + [D0k/Dk(u, v)]^{2n})] [1/(1 + [D0k/D-k(u, v)]^{2n})]    (4.10-5)
where Dk and D-k are given by Eqs. (4.10-3) and (4.10-4). The constant D0k is
the same for each pair of notches, but it can be different for different pairs.
Other notch reject filters are constructed in the same manner, depending on the highpass filter chosen. As with the filters discussed earlier, a notch pass filter is obtained from a notch reject filter using the expression

HNP(u, v) = 1 - HNR(u, v)    (4.10-6)
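A minimal sketch of Eqs. (4.10-3)–(4.10-5): a Butterworth notch reject filter assembled as a product over symmetric highpass pairs. Per the discussion that follows, it is built for an unpadded M * N DFT; `centers` holds the (uk, vk) offsets, which in practice are chosen interactively from the spectrum, and all names are ours:

```python
import numpy as np

def butterworth_notch_reject(M, N, centers, D0k, n):
    """Product of Butterworth highpass pairs centered at +/-(uk, vk)."""
    u = np.arange(M)
    v = np.arange(N)
    U, V = np.meshgrid(u, v, indexing="ij")
    H = np.ones((M, N))
    for uk, vk in centers:
        Dk = np.sqrt((U - M / 2 - uk)**2 + (V - N / 2 - vk)**2)   # Eq. (4.10-3)
        Dmk = np.sqrt((U - M / 2 + uk)**2 + (V - N / 2 + vk)**2)  # Eq. (4.10-4)
        Dk[Dk == 0] = 1e-8                      # guard the notch centers
        Dmk[Dmk == 0] = 1e-8
        H *= 1.0 / (1.0 + (D0k / Dk) ** (2 * n))    # Eq. (4.10-5), one factor
        H *= 1.0 / (1.0 + (D0k / Dmk) ** (2 * n))   # ... and its mirror
    return H

# The corresponding notch pass filter is 1 - H, per Eq. (4.10-6).
```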
As the next three examples show, one of the principal applications of notch
filtering is for selectively modifying local regions of the DFT. This type of pro-
cessing typically is done interactively, working directly on DFTs obtained
without padding. The advantages of working interactively with actual DFTs
(as opposed to having to “translate” from padded to actual frequency values)
outweigh any wraparound errors that may result from not using padding in
the filtering process. Also, as we show in Section 5.4.4, even more powerful
notch filtering techniques than those discussed here are based on unpadded
DFTs. To get an idea of how DFT values change as a function of padding, see
Problem 4.22.
EXAMPLE 4.23: Reduction of moiré patterns using notch filtering.
■ Figure 4.64(a) is the scanned newspaper image from Fig. 4.21, showing a prominent moiré pattern, and Fig. 4.64(b) is its spectrum. We know from Table 4.3 that the Fourier transform of a pure sine, which is a periodic function, is a pair of conjugate symmetric impulses. The symmetric "impulse-like" bursts in Fig. 4.64(b) are a result of the near periodicity of the moiré pattern. We can attenuate these bursts by using notch filtering.
a b
c d
FIGURE 4.64
(a) Sampled
newspaper image
showing a
moiré pattern.
(b) Spectrum.
(c) Butterworth
notch reject filter
multiplied by the
Fourier
transform.
(d) Filtered
image.
Figure 4.64(c) shows the result of multiplying the DFT of Fig. 4.64(a) by a
Butterworth notch reject filter with D0 = 3 and n = 4 for all notch pairs. The
value of the radius was selected (by visual inspection of the spectrum) to en-
compass the energy bursts completely, and the value of n was selected to give
notches with mildly sharp transitions. The locations of the center of the notch-
es were determined interactively from the spectrum. Figure 4.64(d) shows the
result obtained with this filter using the procedure outlined in Section 4.7.3.
The improvement is significant, considering the low resolution and degrada-
tion of the original image. ■
EXAMPLE 4.24: Enhancement of corrupted Cassini Saturn image by notch filtering.
■ Figure 4.65(a) shows an image of part of the rings surrounding the planet Saturn. This image was captured by Cassini, the first spacecraft to enter the planet's orbit. The vertical sinusoidal pattern was caused by an AC signal superimposed on the camera video signal just prior to digitizing the image. This was an unexpected problem that corrupted some images from the mission. Fortunately, this type of interference is fairly easy to correct by postprocessing. One approach is to use notch filtering.
Figure 4.65(b) shows the DFT spectrum. Careful analysis of the vertical axis
reveals a series of small bursts of energy which correspond to the nearly sinusoidal
a b
c d
FIGURE 4.65
(a) 674 * 674
image of the
Saturn rings
showing nearly
periodic
interference.
(b) Spectrum: The
bursts of energy
in the vertical axis
near the origin
correspond to the
interference
pattern. (c) A
vertical notch
reject filter.
(d) Result of
filtering. The thin
black border in
(c) was added for
clarity; it is not
part of the data.
(Original image
courtesy
of Dr. Robert
A. West,
NASA/JPL.)
a b
FIGURE 4.66
(a) Result
(spectrum) of
applying a notch
pass filter to
the DFT of
Fig. 4.65(a).
(b) Spatial
pattern obtained
by computing the
IDFT of (a).
4.11 Implementation
We have focused attention thus far on theoretical concepts and on examples of
filtering in the frequency domain. One thing that should be clear by now is that
computational requirements in this area of image processing are not trivial.
Thus, it is important to develop a basic understanding of methods by which
Fourier transform computations can be simplified and speeded up. This sec-
tion deals with these issues.
The 2-D DFT is separable into 1-D transforms; that is, we can write

F(u, v) = Σ_{x=0}^{M-1} F(x, v) e^{-j2πux/M}    (4.11-1)

where

F(x, v) = Σ_{y=0}^{N-1} f(x, y) e^{-j2πvy/N}    (4.11-2)
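A minimal sketch of this separability (NumPy assumed; the function name is ours): the 2-D DFT computed as row transforms followed by column transforms, Eqs. (4.11-2) and (4.11-1) respectively.

```python
import numpy as np

def dft2_separable(f):
    """2-D DFT via successive 1-D DFTs."""
    F_xv = np.fft.fft(f, axis=1)      # F(x, v): 1-D DFT of each row, Eq. (4.11-2)
    return np.fft.fft(F_xv, axis=0)   # F(u, v): 1-D DFT of each column, Eq. (4.11-1)

# Sanity check: np.allclose(dft2_separable(f), np.fft.fft2(f)) holds for any 2-D f.
```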