(Image Processing Book) - Jain - Fundamentals of Digital Image Processing - Prentice Hall 1989
ANDERSON & MOORE  Optimal Control: Linear Quadratic Methods
ANDERSON & MOORE  Optimal Filtering
ASTROM & WITTENMARK  Computer-Controlled Systems: Theory and Design, 2/E
DICKINSON  Systems: Analysis, Design and Computation
GARDNER  Statistical Spectral Analysis: A Nonprobabilistic Theory
GOODWIN & SIN  Adaptive Filtering, Prediction, and Control
GRAY & DAVISSON  Random Processes: A Mathematical Approach for Engineers
HAYKIN  Adaptive Filter Theory, 2/E
JAIN  Fundamentals of Digital Image Processing
JOHNSON  Lectures on Adaptive Parameter Estimation
KAILATH  Linear Systems
KUMAR & VARAIYA  Stochastic Systems: Estimation, Identification, and Adaptive Control
KUNG  VLSI Array Processors
KUNG, WHITEHOUSE, & KAILATH, EDS.  VLSI and Modern Signal Processing
KWAKERNAAK & SIVAN  Modern Signals and Systems
LANDAU  System Identification and Control Design Using P.I.M. + Software
LJUNG  System Identification: Theory for the User
MACOVSKI  Medical Imaging Systems
MIDDLETON & GOODWIN  Digital Control and Estimation: A Unified Approach
NARENDRA & ANNASWAMY  Stable Adaptive Systems
SASTRY & BODSON  Adaptive Control: Stability, Convergence, and Robustness
SOLIMAN & SRINATH  Continuous and Discrete Signals and Systems
SPILKER  Digital Communications by Satellite
WILLIAMS  Designing Digital Filters
Fundamentals
of Digital
Image Processing
ANIL K. JAIN
Editorial/production supervision: Colleen Brosnan
Manufacturing buyer: Mary Noonan
Page layout: Martin Behan
Cover design: Diane Saxe
Logo design: A. M. Bruckstein
Cover art: Halley's comet image by the author, reconstructed from data gathered by NASA's Pioneer Venus Orbiter in 1986.

Printed in the United States of America
10 9
ISBN 0-13-336165-9
Contents
PREFACE xix
ACKNOWLEDGMENTS xxi
1 INTRODUCTION 1
2.10 Discrete Random Fields 35
Definitions, 35
Separable and Isotropic Covariance Functions, 36
2.11 The Spectral Density Function 37
Properties of the SDF, 38
2.12 Some Results from Estimation Theory 39
Mean Square Estimates, 40
The Orthogonality Principle, 40
2.13 Some Results from Information Theory 41
Information, 42
Entropy, 42
The Rate Distortion Function, 43
Problems 44
Bibliography 47
3 IMAGE PERCEPTION 49
3.1 Introduction 49
3.2 Light, Luminance, Brightness, and Contrast 49
Simultaneous Contrast, 51
Mach Bands, 53
3.3 MTF of the Visual System 54
3.4 The Visibility Function 55
3.5 Monochrome Vision Models 56
3.6 Image Fidelity Criteria 57
3.7 Color Representation 60
3.8 Color Matching and Reproduction 62
Laws of Color Matching, 63
Chromaticity Diagram, 65
3.9 Color Coordinate Systems 66
3.10 Color Difference Measures 71
3.11 Color Vision Model 73
3.12 Temporal Properties of Vision 75
Bloch's Law, 75
Critical Fusion Frequency (CFF), 75
Spatial versus Temporal Effects, 75
Problems 76
Bibliography 78
Frequencies, 87
Sampling Theorem, 88
Remarks, 89
4.3 Extensions of Sampling Theory 89
Sampling Random Fields, 90
Sampling Theorem for Random Fields, 90
Remarks, 90
Nonrectangular Grid Sampling and Interlacing, 91
Hexagonal Sampling, 92
Optimal Sampling, 92
4.4 Practical Limitations in Sampling and Reconstruction 93
Sampling Aperture, 93
Display Aperture/Interpolation Function, 94
Lagrange Interpolation, 98
Moire Effect and Flat Field Response, 99
4.5 Image Quantization 99
4.6 The Optimum Mean Square or Lloyd-Max Quantizer 101
The Uniform Optimal Quantizer, 103
Properties of the Optimum Mean Square Quantizer, 103
Proof, 112
4.7 A Compandor Design 113
Remarks, 114
Log-Ratios, 261
Principal Components, 261
8.7 Smoothing Splines and Interpolation 295
Remarks, 297
8.8 Least Squares Filters 297
Constrained Least Squares Restoration, 297
Remarks, 298
8.9 Generalized Inverse, SVD, and Iterative Methods 299
The Pseudoinverse, 299
Minimum Norm Least Squares (MNLS)
8.11 Causal Models and Recursive Filtering 307
A Vector Recursive Filter, 308
Stationary Models, 310
Steady-State Filter, 310
A Two-Stage Recursive Filter, 310
A Reduced Update Filter, 310
Remarks, 311
8.12 Semicausal Models and Semirecursive Filtering 311
Filter Formulation, 312
8.13 Digital Processing of Speckle Images 313
Speckle Representation, 313
Speckle Reduction: N-Look Method, 315
Spatial Averaging of Speckle, 315
Homomorphic Filtering, 315
Maximum Entropy Restoration 316
Distribution-Entropy Restoration, 317
Log-Entropy Restoration, 318
Bibliography 335
Transform Features 346
Edge Detection 347
Gradient Operators, 348
Compass Operators, 350
Laplace Operators and Zero Crossings, 351
Stochastic Gradients, 353
Performance of Edge Detection Operators, 355
Line and Spot Detection, 356
Contour Following, 358
Edge Linking and Heuristic Graph Searching, 358
Dynamic Programming, 359
Hough Transform, 362
9.6 Boundary Representation 362
Chain Codes, 363
Fitting Line Segments, 364
B-Spline Representation, 364
Fourier Descriptors, 370
Autoregressive Models, 374
9.8 Moment Representation 377
Definitions, 377
Moment Representation Theorem, 378
Moment Matching, 378
Orthogonal Moments, 379
Moment Invariants, 380
Applications of Moment Invariants, 381
Texture 394
Statistical Approaches, 394
Structural Approaches, 398
Other Approaches, 399
9.12 Scene Matching and Detection 400
Image Subtraction, 400
Template Matching and Area Correlation, 400
Matched Filtering, 403
Direct Search Methods, 404
10 IMAGE RECONSTRUCTION FROM PROJECTIONS 431
10.1 Introduction 431
Transmission Tomography, 431
Reflection Tomography, 432
Emission Tomography, 433
Magnetic Resonance Imaging, 434
Projection-based Image Processing, 434
Definition, 434
Notation, 436
Properties of the Radon Transform, 437
The Back-projection Operator 439
Definition, 439
Remarks, 440
Transform, 447
11.2 Pixel Coding 479
PCM, 480
Entropy Coding, 480
Run-Length Coding, 481
Bit-Plane Encoding, 483
Problems 557
Bibliography 561
INDEX 566
Preface
form, linear systems, and some experience with matrix algebra. Typically, an entry-level graduate course in digital signal processing is sufficient. Chapter 2 of the text includes much of the mathematical background that is needed in the rest of the book. A student who masters Chapter 2 should be able to handle most of the image processing problems discussed in the text and elsewhere in the image processing literature.
The advanced course (Image Processing II) covers Sections 2.9, 2.13, and selected topics from Chapters 6, 8, 9, 10, and 11. Both courses are taught using visual aids such as overhead transparencies and slides to maximize discussion time and to minimize in-class writing time while maintaining a reasonable pace. In the advanced course, the prerequisites include Image Processing I and entry-level graduate coursework in linear systems and random signals.
Chapters 3 to 6 cover the topic of image representation. Chapter 3 is devoted to low-level representation of visual information such as luminance, color, and spatial and temporal properties of vision. Chapter 4 deals with image digitization, an essential step for digital processing. In Chapter 5, images are represented as series expansions of orthogonal arrays or basis images. In Chapter 6, images are considered as random signals.
Chapters 7 through 11 are devoted to image processing techniques based on representations developed in the earlier chapters. Chapter 7 is devoted to image enhancement techniques, a topic of considerable importance in the practice of image processing. This is followed by a chapter on image restoration that deals with the theory and algorithms for removing degradations in images. Chapter 9 is concerned with the end goal of image processing, that is, image analysis. A special image restoration problem is image reconstruction from projections, a problem of immense importance in medical imaging and nondestructive testing of objects. The theory and techniques of image reconstruction are covered in Chapter 10. Chapter 11 is devoted to image data compression, a topic of fundamental importance in image communication and storage.
Each chapter concludes with a set of problems and an annotated bibliography. The problems either go into the details or provide extensions of results presented in the text. The problems marked with an asterisk (*) involve computer simulations. The problem sets give readers an opportunity to further their expertise on the relevant topics in image processing. The annotated bibliography provides a quick survey of the topics for enthusiasts who wish to pursue the subject matter in greater depth.
Supplementary Course Materials

Forthcoming with this text is an instructor's manual that contains solutions to selected problems from the text, a list of experimental laboratory projects, and course syllabus design suggestions for various situations.
Acknowledgments
I am deeply indebted to the many people who have contributed in making the completion of this book possible. Ralph Algazi, Mike Buonocore, Joe Goodman, Sarah Rajala, K. R. Rao, Jorge Sanz, S. Srinivasan, John Woods, and Yasuo Yoshida carefully read portions of the manuscript and provided important feedback. Many graduate students, especially Siamak Ansari, Steve Azevedo, Jon Brandt, Ahmed Darwish, Paul Farrelle, Jaswant Jain, Phil Kelly, David Paglieroni, S. Ranganath, John Sanders, S. H. Wang, and Wim Van Warmerdam, provided valuable inputs through many examples and experimental results presented in the text. Ralph Algazi and his staff, especially Tom Arons and Jim Stewart, have contributed greatly through their assistance in implementing the computer experiments at the UCD Image Processing Facility and the Computer Vision Research Laboratory. Vivien Braly and Liz Fenner provided much help in typing and organizing several parts of the book. Colleen Brosnan and Tim Bozik of Prentice Hall provided the much-needed focus, guidance, and editorial assistance required to keep the book on schedule and bring it to market. Thanks are also due to Tom Kailath for his enthusiasm for this work.
Finally, I would like to dedicate this book to my favorite image, Mohini, my children Mukul, Malini, and Ankit, and to my parents, all, for their constant support and encouragement.
1
Introduction
Figure 1.1 A digital image processing system (Signal and Image Processing Laboratory, University of California, Davis).
[Figure: a typical digital image processing sequence: object → observe (imaging system) → digitize (sample and quantize) → store (online disk storage) → process (digital computer) → refresh/store (display buffer) → output/record.]
[Figure: sample digital images. (a) Space probe images: Moon and Mars; (b) multispectral images: visual and infrared; (c) medical images: X ray and eyeball; (d) optical camera images: Golden Gate and downtown San Francisco.]
1.1 Digital Image Processing: Problems and Applications
1. Image representation and modeling'
2. Image enhancement
3. Image restoration
4. Image analysis
5. Image reconstruction
6. Image data compression
[Figure: image representation and modeling: perception models, local models, and global models.]
In image enhancement, the goal is to accentuate certain image features for subsequent analysis or for image display. Examples include contrast and edge enhancement, pseudocoloring, noise filtering, sharpening, and magnifying. Image enhancement is useful in feature extraction, image analysis, and visual information display. The enhancement process itself does not increase the inherent information content in the data. It simply emphasizes certain specified image characteristics. Enhancement algorithms are generally interactive and application-dependent.
Image enhancement techniques, such as contrast stretching, map each gray level into another gray level by a predetermined transformation. An example is the histogram equalization method, where the input gray levels are mapped so that the output gray level distribution is uniform. This has been found to be a powerful method of enhancement of low contrast images (see Fig. 7.14). Other enhancement techniques perform local neighborhood operations as in convolution, transform operations as in the discrete Fourier transform, and other operations as in pseudocoloring, where a gray level image is mapped into a color image by assigning different colors to different features. Examples and details of these techniques are considered in Chapter 7.
Figure 1.7 Blurring due to an imaging system. Given the noisy and blurred image g(x, y), the image restoration problem is to find an estimate of the input image f(x, y).
g(x, y) = ∫∫_{−∞}^{∞} h(x, y; α, β) f(α, β) dα dβ + η(x, y)    (1.1)
where η(x, y) is the additive noise function, f(α, β) is the object, g(x, y) is the image, and h(x, y; α, β) is called the point spread function (PSF). A typical image restoration problem is to find an estimate of f(α, β) given the PSF, the blurred image, and the statistical properties of the noise process.
A fundamental result in filtering theory used commonly for image restoration is called the Wiener filter. This filter gives the best linear mean square estimate of the object from the observations. It can be implemented in the frequency domain via the fast unitary transforms, in the spatial domain by two-dimensional recursive techniques similar to Kalman filtering, or by FIR nonrecursive filters (see Fig. 8.15). It can also be implemented as a semirecursive filter that employs a unitary transform in one of the dimensions and a recursive filter in the other.
Several other image restoration methods, such as least squares, constrained least squares, and spline interpolation methods, can be shown to belong to the class of Wiener filtering algorithms. Other methods such as maximum likelihood, maximum entropy, and maximum a posteriori are nonlinear techniques that require iterative solutions. These and other algorithms useful in image restoration are discussed in Chapter 8.
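A minimal frequency-domain sketch of the Wiener filter idea, assuming the point spread function and a (scalar) signal-to-noise ratio are known; the array and parameter names are illustrative:

    import numpy as np

    def wiener_restore(g, h, snr):
        # g   : blurred, noisy observation
        # h   : point spread function, zero padded to the size of g and centered
        # snr : assumed ratio of object power to noise power (a scalar for simplicity)
        G = np.fft.fft2(g)
        H = np.fft.fft2(np.fft.ifftshift(h))
        # Wiener filter: H* / (|H|^2 + 1/SNR)
        W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
        return np.real(np.fft.ifft2(W * G))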
[Figure: an automated inspection setup: objects from a hopper are viewed by a camera, and the image analysis system decides accept or reject.]
measuring the size and orientation of blood cells in a medical image. More advanced image analysis systems measure quantitative information and use it to make a sophisticated decision, such as controlling the arm of a robot to move an object after identifying it or navigating an aircraft with the aid of images acquired along its trajectory.
Image analysis techniques require extraction of certain features that aid in the identification of the object. Segmentation techniques are used to isolate the desired object from the scene so that measurements can be made on it subsequently. Quantitative measurements of object features allow classification and description of the image. These techniques are considered in Chapter 9.
" Image reconstruction from projections is a special class of image restoration prob-
lems where a two- (or higher) dimensional object is reconstructed from several
one-dimensional projections. Each projection is obtained by projecting a parallel
X ray (or other penetrating radiation) beam through the object (Fig. 1.9). Planar
projections are thus obtained by viewing the object from many different angles.
Reconstruction. algorithms derive an image of a thin axial slice of the object, giving
an inside view otherwise unobtainable without performing extensive surgery,' Such
techniques are important in medical imaging (CT scanners), astronomy, radar imag-
ing, geological exploration, and nondestructive testing of assemblies.
Mathematically, image reconstruction problems can be set up in the frame-
work of Radon transform theory. This theory leads to several'useful reconstruction
algorithms, details of which are discussed in Chapter 10.
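A toy numerical sketch of the projection idea (not the Radon transform machinery of Chapter 10): parallel projections are simulated by rotating a digital image and summing along one axis, and a crude unfiltered back-projection smears each projection back over the image. The SciPy rotation helper and all parameter choices here are assumptions for illustration only.

    import numpy as np
    from scipy.ndimage import rotate

    def project(image, angles_deg):
        # One parallel-beam projection per angle: rotate, then sum the columns.
        return [rotate(image, a, reshape=False, order=1).sum(axis=0)
                for a in angles_deg]

    def backproject(projections, angles_deg, size):
        # Crude unfiltered back-projection onto a size x size grid.
        recon = np.zeros((size, size))
        for proj, a in zip(projections, angles_deg):
            smear = np.tile(proj, (size, 1))          # spread the projection
            recon += rotate(smear, -a, reshape=False, order=1)
        return recon / len(angles_deg)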
[Figure 1.9 Image reconstruction from projections: parallel-beam projections of the object taken at many angles are fed to a reconstruction algorithm, which produces an image of a cross-sectional slice.]
The amount of data associated with visual information is so large (see Table 1.1a) that its storage would require enormous storage capacity. Although the capacities of several storage media (Table 1.1b) are substantial, their access speeds are usually inversely proportional to their capacity. Typical television images generate data rates exceeding 10 million bytes per second. There are other image sources that generate even higher data rates. Storage and/or transmission of such data require large capacity and/or bandwidth, which could be very expensive. Image data compression techniques are concerned with reduction of the number of bits required to store or transmit images without any appreciable loss of information.
TABLE 1.1b Storage Capacities (in Millions of Bytes)

Human brain                   125,000,000
Magnetic cartridge                250,000
Optical disc memory                12,500
Magnetic disc                         760
2400-ft magnetic tape                 200
Floppy disc                          1.25
Solid-state memory modules           0.25
Image transmission applications are in broadcast television; remote sensing via satellite, aircraft, radar, or sonar; teleconferencing; computer communications; and facsimile transmission. Image storage is required most commonly for educational and business documents, medical images used in patient monitoring systems, and the like. Because of their wide applications, data compression is of great importance in digital image processing. Various image data compression techniques and examples are discussed in Chapter 11.
BIBLIOGRAPHY
For books and special issues of journals devoted to digital image processing:
Two-Dimensional Systems and Mathematical Preliminaries
2.1 INTRODUCTION
This chapter presents a review of several fundamental results from linear systems theory and matrix algebra that are important in digital image processing. Two-dimensional random fields and some important concepts from probability and estimation theory are then reviewed. The emphasis is on the final results and their applications in image processing. It is assumed that the reader has encountered most of these basic concepts earlier. The summary discussion provided here is intended to serve as an easy reference for subsequent chapters. The problems at the end of the chapter provide an opportunity to revise these concepts through special cases and examples.
•
specify integer indices of arrays and vectors. The symbol roman j will represent
VCI. The complex conjugate of a complex variable such as z, will be denoted by
z*. Certain symbols will be redefined at appropriate places in the text to keep the
notation clear. .
Table 2.1 lists several well-known one-dimensional functions that will be often
encountered. Their two-dimensional versions are functions of the~.(;B~l'!Jf form
•
I(x, y) = Nx)f,(y) (2.1)
For example, the two-dimensional delta functions are defined as
_,Dirac: 8(x,y) =8(x)8(y)
,K.r~er; B(m, n) = 8(m)8(n)
.
image in the output plane due to an ideal point source at location (m', n ') in the
input plane. In our notation, the semicolon (;) is employed to distinguish the input
and output pairs of coordinates.
The impulse response is called th«~~ilJJ~r~!iu:ljij.lJ.dJJ)ll (PSF) when the inputs
.ii.rJ.d~'rtPutsIepresent a positive quantity such as the intensity of light in imaging
§ystem.b The term impulse response is more general and·is.allQwed to take negative
as well as complex values. The region 0,[ SUllP0rt of animpuIseJespon~jsthe
~allest dosed regi()ii~in the m, n plane outside which the impulse resI'0nse is zero.
A system is said to be a finite impulse response (FIR) OqlIl infinite imjlulse response
(IIR) system if its impulse response has finite or infinite regions of support, re-
spectively. " .
The output of any linear system can be obtained from its impulse response and the input by applying the superposition rule of (2.6) to the representation of (2.4) as follows:

y(m, n) = ℋ[x(m, n)]
⇒  y(m, n) = Σ_{m'} Σ_{n'} x(m', n') h(m, n; m', n')    (2.8)
which is called the convolution of the input with the impulse response. Figure 2.3 shows a graphical interpretation of this operation. The impulse response array is rotated about the origin by 180° and then shifted by (m, n) and overlaid on the array x(m', n'). The sum of the products of the arrays {x(·,·)} and {h(·,·)} in the overlapping regions gives the result at (m, n). We will use the symbol @ to denote the convolution operation in both discrete and continuous cases, i.e.,

y(m, n) = h(m, n) @ x(m, n) ≜ Σ_{m'} Σ_{n'} h(m − m', n − n') x(m', n')    (2.11)
Figure 2.2 Examples of PSFs: (a) circularly symmetric PSF of average atmospheric turbulence causing small blur; (b) atmospheric turbulence PSF causing large blur; (c) separable PSF of a diffraction-limited system with square aperture; (d) same as (c) but with smaller aperture.
Figure 2.3 Discrete convolution: (a) the impulse response array h(m − m', n − n') in the (m', n') plane; (b) the output at location (m, n) is the sum of products of the two arrays over the area of overlap.
The convolution operation has several interesting properties, which are explored in Problems 2.2 and 2.3.
Example 2.1 (Discrete convolution)
Consider the 2 × 2 and 3 × 2 arrays h(m, n) and x(m, n) shown next, where the boxed element is at the origin. Also shown are the various steps for obtaining the convolution of these two arrays. The result y(m, n) is a 4 × 3 array. In general, the convolution of two arrays of sizes (M1 × N1) and (M2 × N2) yields an array of size (M1 + M2 − 1) × (N1 + N2 − 1) (Problem 2.5).
[Arrays for Example 2.1: (a) x(m, n); (b) h(m, n); (c) h(−m, −n); (d) h(1 − m, −n); (e) the partial result y(1, 0) = −2 + 5 = 3; (f) the full output y(m, n).]
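The steps of Example 2.1 can also be verified numerically. The arrays below are illustrative stand-ins (the example's own numbers are not reproduced here); the output size follows the (M1 + M2 − 1) × (N1 + N2 − 1) rule quoted above.

    import numpy as np
    from scipy.signal import convolve2d

    x = np.array([[1, 2],
                  [3, 4],
                  [5, 6]])        # a 3 x 2 input array (illustrative values)
    h = np.array([[1, -1],
                  [0,  2]])       # a 2 x 2 impulse response (illustrative values)

    y = convolve2d(x, h)          # full linear convolution
    print(y.shape)                # (3 + 2 - 1, 2 + 2 - 1) = (4, 3)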
The Fourier transform of a two-dimensional function f(x, y) is defined as

F(ξ1, ξ2) = ∫∫_{−∞}^{∞} f(x, y) exp[−j2π(xξ1 + yξ2)] dx dy    (2.13)
Table 2.3 gives a summary of the properties of the two-dimensional Fourier trans-
form. Some of these properties are discussed next.
F(ξ1, ξ2) = ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} f(x, y) exp(−j2πxξ1) dx ] exp(−j2πyξ2) dy

This means the two-dimensional transformation can be realized by a succession of one-dimensional transformations along each of the spatial coordinates.
4. Frequency response and eigenfunctions of shift invariant systems. An eigenfunction of a system is defined as an input function that is reproduced at the output with a possible change only in its amplitude. A fundamental property of a linear shift invariant system is that its eigenfunctions are given by the complex exponential exp[j2π(ξ1 x + ξ2 y)]. Thus in Fig. 2.4, for any fixed (ξ1, ξ2), the output of the linear shift invariant system would be
This theorem suggests that the convolution of two functions may be evaluated by inverse Fourier transforming the product of their Fourier transforms. The discrete version of this theorem yields a fast Fourier transform based convolution algorithm (see Chapter 5).
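A quick numerical check of the discrete convolution theorem, using arbitrary test arrays: zero-pad both arrays to the full output size, multiply their DFTs, and inverse transform; the result should match the direct double-sum convolution.

    import numpy as np

    rng = np.random.default_rng(0)
    x, h = rng.standard_normal((3, 4)), rng.standard_normal((2, 2))

    M = x.shape[0] + h.shape[0] - 1
    N = x.shape[1] + h.shape[1] - 1

    # Direct convolution via the defining double sum.
    y_direct = np.zeros((M, N))
    for m in range(h.shape[0]):
        for n in range(h.shape[1]):
            y_direct[m:m + x.shape[0], n:n + x.shape[1]] += h[m, n] * x

    # Convolution theorem: product of zero-padded DFTs.
    y_fft = np.real(np.fft.ifft2(np.fft.fft2(x, (M, N)) * np.fft.fft2(h, (M, N))))
    print(np.allclose(y_direct, y_fft))     # True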
The converse of the convolution theorem is that the Fourier transform of the product of two functions is the convolution of their Fourier transforms. The result of the convolution theorem can also be extended to the spatial correlation between two real functions h(x, y) and f(x, y). A closely related result is

∫∫_{−∞}^{∞} f(x, y) h*(x, y) dx dy = ∫∫_{−∞}^{∞} F(ξ1, ξ2) H*(ξ1, ξ2) dξ1 dξ2    (2.20)

i.e., with h = f, the total energy in the function is the same as in its Fourier transform.
7. Hankel transform. The Fourier transform of a circularly symmetric function is also circularly symmetric and is given by what is called the Hankel transform (see Problem 2.10).
The inverse transform is given by

x(n) = (1/2π) ∫_{−π}^{π} X(ω) exp(jωn) dω    (2.23)
Note that X(ω) is periodic with period 2π. Hence it is sufficient to specify it over one period.
The Fourier transform pair of a two-dimensional sequence x(m, n) is defined as

X(ω1, ω2) ≜ Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} x(m, n) exp[−j(mω1 + nω2)]    (2.24)

Now X(ω1, ω2) is periodic with period 2π in each argument, i.e.,

X(ω1 ± 2π, ω2 ± 2π) = X(ω1 ± 2π, ω2) = X(ω1, ω2 ± 2π) = X(ω1, ω2)    (2.25)

Often, the sequence x(m, n) in the series in (2.24) is absolutely summable, i.e.,

Σ_{m, n = −∞}^{∞} |x(m, n)| < ∞    (2.26)

Analogous to the continuous case, H(ω1, ω2), the Fourier transform of the shift invariant impulse response, is called the frequency response. The Fourier transform of sequences has many properties similar to the Fourier transform of continuous functions. These are summarized in Table 2.4.
TABLE 2.4 Properties of the Fourier Transform of Sequences

Sequence x(m, n), y(m, n), h(m, n), ...         Transform X(ω1, ω2), Y(ω1, ω2), H(ω1, ω2), ...
Linearity:            a1 x1(m, n) + a2 x2(m, n)        ↔  a1 X1(ω1, ω2) + a2 X2(ω1, ω2)
Conjugation:          x*(m, n)                         ↔  X*(−ω1, −ω2)
Separability:         x1(m) x2(n)                      ↔  X1(ω1) X2(ω2)
Shifting:             x(m ± m0, n ± n0)                ↔  exp[±j(m0 ω1 + n0 ω2)] X(ω1, ω2)
Modulation:           exp[±j(ω01 m + ω02 n)] x(m, n)   ↔  X(ω1 ∓ ω01, ω2 ∓ ω02)
Convolution:          y(m, n) = h(m, n) @ x(m, n)      ↔  Y(ω1, ω2) = H(ω1, ω2) X(ω1, ω2)
Multiplication:       h(m, n) x(m, n)                  ↔  (1/4π²) H(ω1, ω2) @ X(ω1, ω2)
Spatial correlation:  c(m, n) = h(m, n) ⋆ x(m, n)      ↔  C(ω1, ω2) = H(−ω1, −ω2) X(ω1, ω2)
Inner product:        I = Σ Σ x(m, n) y*(m, n) = (1/4π²) ∫∫_{−π}^{π} X(ω1, ω2) Y*(ω1, ω2) dω1 dω2
Energy conservation:  E = Σ Σ |x(m, n)|² = (1/4π²) ∫∫_{−π}^{π} |X(ω1, ω2)|² dω1 dω2
Complex exponential:  Σ Σ exp[j(m ω01 + n ω02)]        ↔  4π² δ(ω1 − ω01, ω2 − ω02)
Impulse:              δ(m, n)                          ↔  1
2.5 THE Z-TRANSFORM OR LAURENT SERIES
A useful generalization of the Fourier series is the Z-transform, which for a two-dimensional complex sequence x(m, n) is defined as

X(z1, z2) ≜ Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} x(m, n) z1^{−m} z2^{−n}

where z1, z2 are complex variables. The set of values of z1, z2 for which this series converges uniformly is called the region of convergence. The Z-transform of the impulse response of a linear shift invariant discrete system is called its transfer function. Applying the convolution theorem for Z-transforms (Table 2.5), we can transform (2.10) to obtain Y(z1, z2) = H(z1, z2) X(z1, z2). The inverse Z-transform is given by

x(m, n) = (1/(2πj))² ∮∮ X(z1, z2) z1^{m−1} z2^{n−1} dz1 dz2    (2.28)

where the contours of integration are counterclockwise and lie in the region of convergence. When the region of convergence includes the unit circles |z1| = 1, |z2| = 1, then evaluation of X(z1, z2) at z1 = exp(jω1), z2 = exp(jω2) yields the Fourier transform of x(m, n). Sometimes X(z1, z2) is available as a finite series (such as the transfer function of FIR filters). Then x(m, n) can be obtained by inspection as the coefficient of the term z1^{−m} z2^{−n}.
TABLE 2.5 Properties of the Z-Transform

Sequence x(m, n), y(m, n), h(m, n), ...     Transform X(z1, z2), Y(z1, z2), H(z1, z2), ...
Rotation:        x(−m, −n)                  ↔  X(z1^{−1}, z2^{−1})
Linearity:       a1 x1(m, n) + a2 x2(m, n)  ↔  a1 X1(z1, z2) + a2 X2(z1, z2)
Conjugation:     x*(m, n)                   ↔  X*(z1*, z2*)
Separability:    x1(m) x2(n)                ↔  X1(z1) X2(z2)
Shifting:        x(m ± m0, n ± n0)          ↔  z1^{±m0} z2^{±n0} X(z1, z2)
Multiplication:  x(m, n) y(m, n)            ↔  (1/(2πj))² ∮_{C1} ∮_{C2} X(z1', z2') Y(z1/z1', z2/z2') dz1'/z1' dz2'/z2'
Causality and Stability

A one-dimensional shift invariant system is called causal if its output at any time is not affected by future inputs. This means its impulse response h(n) = 0 for n < 0 and its transfer function must have a one-sided Laurent series, i.e.,

H(z) = Σ_{n=0}^{∞} h(n) z^{−n}    (2.29)

Extending this definition, any sequence x(n) is called causal if x(n) = 0, n < 0; anticausal if x(n) = 0, n ≥ 0; and noncausal if it is neither causal nor anticausal.
A system is called stable if its output remains uniformly bounded for any bounded input. For linear shift invariant systems, this condition requires that the impulse response be absolutely summable (prove it!), i.e.,

Σ_{n=−∞}^{∞} |h(n)| < ∞    (2.30)

This means H(z) cannot have any poles on the unit circle |z| = 1. If this system is to be causal and stable, then the convergence of (2.29) at |z| = 1 implies the series must converge for all |z| > 1, i.e., the poles of H(z) must lie inside the unit circle.
In two dimensions, a linear shift invariant system is stable when

Σ Σ_{m, n = −∞}^{∞} |h(m, n)| < ∞    (2.31)

which implies the region of convergence of H(z1, z2) must include the unit circles, i.e., |z1| = 1, |z2| = 1.
OTF = H(ξ1, ξ2) / H(0, 0)    (2.32)

The modulation transfer function (MTF) is defined as the magnitude of the OTF, i.e.,

MTF = |OTF| = |H(ξ1, ξ2)| / |H(0, 0)|    (2.33)

Similar relations are valid for discrete systems. Figure 2.5 shows the MTFs of systems whose PSFs are displayed in Fig. 2.2. In practice, it is often the MTF that is measurable. The phase of the frequency response is estimated from physical considerations. For many optical systems, the OTF itself is positive.
Example 2.2
The impulse response of an imaging system is given as h(x, y) = 2 sin²[π(x − x0)] / [π(x − x0)]² · sin²[π(y − y0)] / [π(y − y0)]². Then its frequency response is H(ξ1, ξ2) = 2 tri(ξ1, ξ2) exp[−j2π(x0 ξ1 + y0 ξ2)], the OTF = tri(ξ1, ξ2) exp[−j2π(x0 ξ1 + y0 ξ2)], and the MTF = tri(ξ1, ξ2).
u ≜ {u(n)} = [u(1) u(2) ⋯ u(N)]^T    (2.34)

The nth element of the vector u is denoted by u(n), u_n, or [u]_n. Unless specified otherwise, all vectors will be column vectors. A column vector of size N is also called an N × 1 vector. Likewise, a row vector of size N is called a 1 × N vector.
Let x be a one-to-one ordering of the elements of the array {x(m, n)} into a vector. For an M × N matrix, a mapping used often is called the lexicographic or dictionary ordering. This is a row-ordered vector and is defined as

x ≜ [x(1, 1) x(1, 2) ⋯ x(1, N) x(2, 1) ⋯ x(2, N) ⋯ x(M, 1) ⋯ x(M, N)]^T = O_r{x(m, n)}    (2.36a)

Thus x^T is the row vector obtained by stacking each row to the right of the previous row of X. Another useful mapping is the column by column stacking, which gives a column-ordered vector

x ≜ [x(1, 1) x(2, 1) ⋯ x(M, 1) x(1, 2) ⋯ x(M, 2) ⋯ x(1, N) ⋯ x(M, N)]^T = O_c{x(m, n)}    (2.36b)

This can also be written as x = [x1^T x2^T ⋯ xN^T]^T, where x_n is the nth column of X.
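In NumPy terms, the row ordering (2.36a) and the column ordering (2.36b) correspond to flattening an array in C order and in Fortran order, respectively (a small illustrative check):

    import numpy as np

    X = np.arange(1, 7).reshape(2, 3)    # a 2 x 3 array with elements 1..6

    x_row = X.flatten(order='C')         # row ordering: [1 2 3 4 5 6]
    x_col = X.flatten(order='F')         # column ordering: [1 4 2 5 3 6]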
Transposition and Conjugation Rules

1. A*^T = [A^T]*
2. [AB]^T = B^T A^T
3. [A^{−1}]^T = [A^T]^{−1}
4. [AB]* = A* B*

Note that the conjugate transpose is denoted by A*^T. In matrix theory literature, a simplified notation A^H is often used to denote the conjugate transpose of A. In the theory of image transforms (Chapter 5), we will have to deal with A, A*, A^T, and A*^T, and hence the need for the notation.
Toeplitz and Circulant Matrices

A Toeplitz matrix T is a matrix that has constant elements along the main diagonal and the subdiagonals. This means the elements t(m, n) depend only on the difference m − n, i.e., t(m, n) = t_{m−n}. Thus an N × N Toeplitz matrix is of the form

T = [ t0      t−1     t−2   ⋯   t−(N−1)
      t1      t0      t−1   ⋯   t−(N−2)
      ⋮                      ⋱   ⋮
      tN−1    tN−2    ⋯          t0     ]    (2.37)

and is completely defined by the (2N − 1) elements {t_k, −N + 1 ≤ k ≤ N − 1}. Toeplitz matrices describe the input-output transformations of one-dimensional linear shift invariant systems (see Example 2.3) and the correlation matrices of stationary sequences.
A matrix C is called circulant if each of its rows (or columns) is a circular shift of the previous row (or column), i.e.,

C = [ c0      c1      c2    ⋯   cN−1
      cN−1    c0      c1    ⋯   cN−2
      ⋮                      ⋱   ⋮
      c1      c2      ⋯   cN−1   c0   ]    (2.38)
Note that y(n) will be zero outside the interval −1 ≤ n ≤ 5. In vector notation, this can be written as a 7 × 5 Toeplitz matrix operating on a 5 × 1 vector, namely

[ y(−1) ]   [ −1   0   0   0   0 ]
[ y(0)  ]   [  0  −1   0   0   0 ]   [ x(0) ]
[ y(1)  ]   [  1   0  −1   0   0 ]   [ x(1) ]
[ y(2)  ] = [  0   1   0  −1   0 ]   [ x(2) ]
[ y(3)  ]   [  0   0   1   0  −1 ]   [ x(3) ]
[ y(4)  ]   [  0   0   0   1   0 ]   [ x(4) ]
[ y(5)  ]   [  0   0   0   0   1 ]
(modulo 4). In vector notation this gives

[ y(0) ]   [ 3  2  1  0 ] [ x(0) ]
[ y(1) ] = [ 0  3  2  1 ] [ x(1) ]
[ y(2) ]   [ 1  0  3  2 ] [ x(2) ]
[ y(3) ]   [ 2  1  0  3 ] [ x(3) ]
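The matrices of Examples 2.3 and 2.4 can be generated directly with SciPy's Toeplitz and circulant constructors; the first-column/first-row entries below follow the displays above, and the input values are illustrative.

    import numpy as np
    from scipy.linalg import toeplitz, circulant

    # Example 2.3: the 7 x 5 Toeplitz matrix acting on a 5 x 1 vector.
    T = toeplitz([-1, 0, 1, 0, 0, 0, 0],     # first column
                 [-1, 0, 0, 0, 0])           # first row
    x = np.array([1, 2, 3, 4, 5])            # illustrative input values
    y = T @ x                                # y(-1), y(0), ..., y(5)

    # Example 2.4: the 4 x 4 circulant matrix, specified by its first column.
    C = circulant([3, 0, 1, 2])
    print(C)    # rows: [3 2 1 0], [0 3 2 1], [1 0 3 2], [2 1 0 3]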
An orthogonal matrix is such that its inverse is equal to its transpose, i.e., A is orthogonal if

A^{−1} = A^T,  or  A A^T = A^T A = I    (2.40)

A matrix A is unitary if its inverse is equal to its conjugate transpose, i.e.,

A A*^T = A*^T A = I    (2.41)

A real orthogonal matrix is also unitary, but a unitary matrix need not be orthogonal. The preceding definitions imply that the columns (or rows) of an N × N unitary matrix are orthogonal and form a complete set of basis vectors in an N-dimensional vector space.
Example 2.5
Consider the matrices A1 = (1/√2)[1 1; 1 −1], A2, and A3. It is easy to check that A1 is orthogonal and unitary. A2 is not unitary. A3 is unitary with orthogonal rows.
Diagonal Forms

where {λ_k} and {φ_k} are the eigenvalues and eigenvectors, respectively, of R.

𝒜 ≜ {A_{i,j}},   1 ≤ i ≤ m,  1 ≤ j ≤ n    (2.47)

is a block matrix where the {A_{i,j}} are p × q matrices. The matrix 𝒜 is called an m × n block matrix of basic dimension p × q. If the A_{i,j} are square matrices (say, p × p), then we also call 𝒜 an m × n block matrix of basic dimension p.
If the block structure is Toeplitz (A_{i,j} = A_{i−j}) or circulant (A_{i,j} = A_{(i−j) modulo n}, m = n), then 𝒜 is called block Toeplitz or block circulant, respectively. Additionally, if each block itself is Toeplitz (or circulant), then 𝒜 is called doubly block Toeplitz (or doubly block circulant). Finally, if the {A_{i,j}} are Toeplitz (or circulant) but A_{i,j} ≠ A_{i−j}, then 𝒜 is called a Toeplitz block (or circulant block) matrix. Note that a doubly Toeplitz (or circulant) matrix need not be fully Toeplitz (or circulant), i.e., the scalar elements of 𝒜 need not be constant along the subdiagonals.
Example 2.6
Consider the two-dimensional convolution
[Arrays x(m, n) and h(m, n) and the block relation 𝒴 = ℋx for this example, with blocks H0, H1 operating on the columns x0, x1 of x to produce the columns y0, y1, y2 of y.]
In terms of the column vectors of x(m, n) and y(m, n), we can write

y_n = Σ_{n'=0}^{3} H_{n−n'} x_{n'}

that is,

[ y0 ]   [ H0  H3  H2  H1 ] [ x0 ]
[ y1 ] = [ H1  H0  H3  H2 ] [ x1 ]  = ℋ x
[ y2 ]   [ H2  H1  H0  H3 ] [ x2 ]
[ y3 ]   [ H3  H2  H1  H0 ] [ x3 ]

where H_{−n} = H_{4−n}. Now ℋ is a doubly circulant, 4 × 4 block matrix of basic dimension 3 × 3.
Kronecker Products

Direct evaluation requires O(N⁴) operations to compute the left side, whereas only O(N³) operations are required to compute the right side. This principle is useful in developing fast algorithms for multiplying matrices that can be expressed as Kronecker products.

Example 2.8
Let
Separable Operations

where [A ⊗ B]_{k,m} is the (k, m)th block of A ⊗ B. Thus if U and V are row-ordered into vectors u and v, respectively, then

V = A U B^T  ⇒  v = (A ⊗ B) u

i.e., the separable transformation of (2.50) maps into a Kronecker product operating on a vector.
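A quick numerical check of this identity with arbitrary small matrices (row ordering is NumPy's default flattening; the sizes and values are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((4, 4))
    U = rng.standard_normal((3, 4))

    V = A @ U @ B.T                      # separable transformation
    v = np.kron(A, B) @ U.flatten()      # Kronecker product acting on row-ordered U
    print(np.allclose(V.flatten(), v))   # True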
2.9 RANDOM SIGNALS

Definitions

Variance: σ_u²(n) = σ²(n) ≜ E[|u(n) − μ(n)|²]    (2.52)
Covariance: Cov[u(n), u(n')] ≜ r_u(n, n') = r(n, n') = E{[u(n) − μ(n)][u*(n') − μ*(n')]}    (2.53)
Cross-covariance: Cov[u(n), v(n')] ≜ r_{uv}(n, n') = E{[u(n) − μ_u(n)][v*(n') − μ_v*(n')]}    (2.54)
Autocorrelation: a_{uu}(n, n') ≜ a(n, n') = E[u(n) u*(n')] = r(n, n') + μ(n) μ*(n')    (2.55)
Cross-correlation: a_{uv}(n, n') = E[u(n) v*(n')] = r_{uv}(n, n') + μ_u(n) μ_v*(n')    (2.56)

The symbol E denotes the mathematical expectation operator. Whenever there is no confusion, we will drop the subscript u from the various functions. For an N × 1 vector u, its mean, covariance, and other properties are defined as

E[u] = μ = {μ(n)},  an N × 1 vector    (2.57)
Cov[u] ≜ E[(u − μ)(u − μ)*^T] ≜ R_u = R = {r(n, n')},  an N × N matrix    (2.58)
Cov[u, v] ≜ E[(u − μ_u)(v − μ_v)*^T] ≜ R_{uv} = {r_{uv}(n, n')},  an N × N matrix    (2.59)

Now μ and R represent the mean vector and the covariance matrix, respectively, of the vector u.
Gaussian or Normal Distribution •
A sequence, possibly infinite, is called a Gaussian (or normal) random process if the
joint probability density of any finite sub-sequence is a Gaussian distribution. For
example, for a Gaussian sequence {u (n), 1 S n S N} the joint density would be
p. (II) = P. (Ul> lL2, •.• ,UN) = [(2'll') N12IRl lI2
r 1
exp{ -Y2 «(I - IJ-)*TR- 1 «(I - IJ-)} (2.61)
where R is the covariance matrix of u and is assumed to be nonsingular. .
•
Stationary Processes

This means the covariance and autocorrelation matrices are Hermitian and nonnegative definite.

Markov Processes

R = [ 1         ρ        ρ²    ⋯   ρ^{N−1}
      ρ         1        ρ     ⋯   ρ^{N−2}
      ⋮                         ⋱   ⋮
      ρ^{N−1}   ⋯        ρ²    ρ    1      ]    (2.68)

which is Toeplitz. In fact the covariance and autocorrelation matrices of all stationary sequences are Toeplitz. Conversely, any sequence, finite or infinite, can be called stationary if its covariance and autocorrelation matrices are Toeplitz.
Two random variables x and y are called independent if and only if their joint probability density function is a product of their marginal densities, i.e.,

p_{x,y}(x, y) = p_x(x) p_y(y)    (2.69)

Two random sequences x(n) and y(n) are called independent if and only if for every n and n', the random variables x(n) and y(n') are independent.
The random variables x and y are said to be orthogonal if

E[xy*] = 0    (2.70)

and are called uncorrelated if

E[xy*] = (E[x])(E[y*]),  or  E[(x − μ_x)(y − μ_y)*] = 0    (2.71)

Thus zero mean uncorrelated random variables are also orthogonal. Gaussian random variables which are uncorrelated are also independent.
Let {x(n), 1 ≤ n ≤ N} be a complex random sequence whose autocorrelation matrix is R. Let Φ be an N × N unitary matrix which reduces R to its diagonal form Λ [see (2.44)]. The transformed vector

y = Φ*^T x    (2.72)

is called the Karhunen-Loeve (KL) transform of x. It satisfies the property

E[y y*^T] = Φ*^T {E[x x*^T]} Φ = Φ*^T R Φ = Λ
⇒  E[y(k) y*(l)] = λ_k δ(k − l)    (2.73)

i.e., the elements of the transformed sequence y(k) are orthogonal. If R represents the covariance matrix rather than the autocorrelation matrix of x, then the sequence y(k) is uncorrelated. The unitary matrix Φ*^T is called the KL transform matrix. Its rows are the conjugate eigenvectors of R, i.e., it is the conjugate transpose of the eigenmatrix of R. The KL transform is of fundamental importance in digital signal and image processing. Its applications and properties are considered in Chapter 5.
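A small numerical sketch of the KL transform for a stationary covariance matrix r(n, n') = ρ^{|n − n'|} (the eigenvector routine and the parameter values are illustrative assumptions):

    import numpy as np

    N, rho = 8, 0.95
    n = np.arange(N)
    R = rho ** np.abs(np.subtract.outer(n, n))   # Toeplitz covariance matrix

    # The eigenvectors of R form the KL transform matrix (R is real and symmetric here).
    lam, Phi = np.linalg.eigh(R)

    # The transformed covariance Phi^T R Phi is diagonal, i.e., y = Phi^T x is decorrelated.
    print(np.allclose(Phi.T @ R @ Phi, np.diag(lam)))   # True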
Definitions

The covariance function of a random field is called separable when it can be expressed as a product of covariance functions of one-dimensional sequences, i.e., if

r(m, n; m', n') = r1(m, m') r2(n, n')    (Nonstationary case)    (2.82)
r(m, n) = r1(m) r2(n)    (Stationary case)    (2.83)

A separable stationary covariance function often used in image processing is

r(m, n) = σ² ρ1^{|m|} ρ2^{|n|},   |ρ1| < 1,  |ρ2| < 1    (2.84)

Here σ² represents the variance of the random field, and ρ1 = r(1, 0)/σ² and ρ2 = r(0, 1)/σ² are the one-step correlations in the m and n directions, respectively.
Another covariance function often considered more realistic for many images is the nonseparable exponential function

r(m, n) = σ² exp{−√(α1 m² + α2 n²)}    (2.85)

When α1 = α2 = α², r(m, n) becomes a function of the Euclidean distance d ≜ √(m² + n²), i.e.,

r(m, n) = σ² exp(−α d)    (2.86)
In practice, these averages are estimated from the given image data by replacing the ensemble averages by sample averages; for example, for an M × N image u(m, n),

μ̂ = (1/MN) Σ_{m=1}^{M} Σ_{n=1}^{N} u(m, n)    (2.87)

r(m, n) ≈ r̂(m, n) = (1/MN) Σ_{m'=1}^{M−m} Σ_{n'=1}^{N−n} [u(m', n') − μ̂][u(m + m', n + n') − μ̂]    (2.88)
The covariance function is simply the inverse Fourier transform of the SDF. In particular, setting m = n = 0,

σ_u² = E[|u(m, n) − μ|²] = r(0, 0) = (1/4π²) ∫∫_{−π}^{π} S_u(ω1, ω2) dω1 dω2    (2.95)

i.e., the volume under S_u(ω1, ω2) is equal to the average power in the random field u(m, n). Therefore, physically, S_u(ω1, ω2) represents the power density in the image at spatial frequencies (ω1, ω2). Hence, the SDF is also known as the power spectral density function or simply the power spectrum of the underlying random field. Often the power spectrum is defined as the Fourier transform of the autocorrelation sequence rather than the covariance sequence. Unless stated otherwise, we will continue to use the definitions based on covariances.
In the text, whenever we refer to S_u(z1, z2) as the SDF, it is implied that z1 = exp(jω1), z2 = exp(jω2). When a spectral density function can be expressed as a ratio of finite polynomials in z1 and z2, it is called a rational spectrum, and it is of the form

S(z1, z2) = [ Σ_{k=−K}^{K} Σ_{l=−L}^{L} b(k, l) z1^{−k} z2^{−l} ] / [ Σ_{m=−M}^{M} Σ_{n=−N}^{N} a(m, n) z1^{−m} z2^{−n} ]    (2.96)

Such SDFs are realized by linear systems represented by finite-order difference equations.
Example 2.12
Consider the separable covariance function defined in (2.84). Taking the Fourier transform, the SDF is found to be the separable function

S(ω1, ω2) = σ²(1 − ρ1²)(1 − ρ2²) / [(1 + ρ1² − 2ρ1 cos ω1)(1 + ρ2² − 2ρ2 cos ω2)]    (2.100)

For the isotropic covariance function of (2.86), an analytic expression for the SDF is not available. Figure 2.7 shows displays of some SDFs.
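Equation (2.100) can be checked numerically by summing the Fourier series of a (truncated) separable covariance (2.84) at a chosen frequency; truncation makes the agreement approximate, and the numerical values here are illustrative.

    import numpy as np

    sigma2, rho1, rho2, K = 1.0, 0.9, 0.8, 256
    m = np.arange(-K, K + 1)
    r = sigma2 * np.outer(rho1 ** np.abs(m), rho2 ** np.abs(m))   # covariance (2.84)

    w1, w2 = 0.3, 0.7
    S_num = np.real(np.exp(-1j * w1 * m) @ r @ np.exp(-1j * w2 * m))
    S_closed = (sigma2 * (1 - rho1**2) * (1 - rho2**2)
                / ((1 + rho1**2 - 2 * rho1 * np.cos(w1))
                   * (1 + rho2**2 - 2 * rho2 * np.cos(w2))))
    print(S_num, S_closed)   # nearly equal for large K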
Here we state some important definitions and results from estimation theory that are useful in many image processing problems.
Let {y(n), 1 ≤ n ≤ N} be a real random sequence and let x be any real random variable. It is desired to find x̂, called the optimum mean square estimate of x, from an observation of the random sequence y(n), such that the mean square error

σ_e² ≜ E[(x − x̂)²]    (2.101)

is minimized. The optimum estimate is the conditional mean

x̂ = E[x|y] = ∫ x p_{x|y}(x|y) dx    (2.102)

where p_{x|y}(x|y) is the conditional probability density of x given the observation vector y. If x and y(n) are independent, then x̂ is simply the mean value of x. Note that x̂ is an unbiased estimate of x, because

E[x̂] = E[E(x|y)] = E[x]    (2.103)

For zero mean Gaussian random variables, x̂ turns out to be linear in y(n), i.e.,

x̂ = Σ_{n=1}^{N} a(n) y(n)    (2.104)

where the coefficients a(n) are determined by solving the linear equations shown next.
•
for any g(y) ~g(y(1),y(2),... ,yeN)),
E[(x - x)g(y)] = 0 . (2.105)
To prove this we write
E[£g(y)] == E[E(xly)g(y)] == E[E (xg (y)jy)] "" E[xg(y»)
which implies (2.105). Since £ is a function of y, this also implies
E[(x - x)ij "" 0 (2.106)
E[(x - £)g(£)] = 0 (2.107)
(2.108) .
•
Once again €X is given by (2.108), where Rt and rxy represent the covariance and
cross-covariance arrays. ff x, yen) are non-Gaussian, them (2.104) and (2.109) still
give the best linear mean square estimate. However, this is not necessarily the
conditional mean. ..
Information theory gives some important concepts that are useful in digital representation of images. Some of these concepts will be used in image quantization (Chapter 4), image transforms (Chapter 5), and image data compression (Chapter 11).

Information

The information conveyed by a message (or event) that occurs with probability p_k is defined as

I_k = −log2 p_k    (2.112)

Since each p_k ≤ 1, I_k is nonnegative. This definition implies that the information conveyed is large when an unlikely message is generated.
Entropy

The entropy of a source gives the lower bound on the number of bits required to encode its output. In fact, according to Shannon's noiseless coding theorem [11, 12], it is possible to code without distortion a source of entropy H bits using an average of H + ε bits/message, where ε > 0 is an arbitrarily small quantity. An alternate form of this theorem states that it is possible to code the source with H bits such that the distortion in the decoded message can be made arbitrarily small.

[Figure: the binary entropy function H(p), which is maximum (1 bit) at p = 1/2.]

H = H(p) = −p log2 p − (1 − p) log2(1 − p)

The maximum entropy is 1 bit, which occurs when both messages are equally likely. Since the source is binary, it is always possible to code the output using 1 bit/message. However, if p ≪ 1/2, then H is much smaller, for example about 0.2 bit, and Shannon's noiseless coding theorem says it is possible to find a noiseless coding scheme that requires only about 0.2 bit/message.
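A one-line numerical check of the binary entropy function used above (the values of p are illustrative):

    import numpy as np

    def binary_entropy(p):
        # H(p) = -p log2 p - (1 - p) log2 (1 - p), in bits
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    for p in (0.5, 0.1, 1.0 / 32):
        print(p, binary_entropy(p))   # 1 bit at p = 0.5; about 0.2 bit at p = 1/32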
Clearly, the maximum value of D is equal to σ_x², the variance of x (see Fig. 2.9). For a fixed average distortion D, the rate distortion function R_D of the vector x is given by

R_D = (1/N) Σ_{k=0}^{N−1} max[0, ½ log2(σ_k²/θ)]    (2.118)
PROBLEMS
2.1 a. Given a sequence u(m, n) = (m + n)³. Evaluate u(m, n) δ(m − 1, n − 2) and u(m, n) @ δ(m − 1, n − 2).
b. Given a function f(x, y) = (x + y)². Evaluate f(x, y) δ(x − 1, y − 2) and f(x, y) @ δ(x − 1, y − 2).
c. Show that (1/2π) ∫_{−π}^{π} e^{jθn} dθ = δ(n).
2.2 (Properties of Discrete Convolution) Prove the following:
a. h(m, n) @ u(m, n) = u(m, n) @ h(m, n)    (Commutative)
b. h(m, n) @ [a1 u1(m, n) + a2 u2(m, n)] = a1[h(m, n) @ u1(m, n)] + a2[h(m, n) @ u2(m, n)]    (Distributive)
c. h(m, n) @ u(m − m0, n − n0) = h(m − m0, n − n0) @ u(m, n)    (Shift invariance)
d. h(m, n) @ [u1(m, n) @ u2(m, n)] = [h(m, n) @ u1(m, n)] @ u2(m, n)    (Associative)
e. h(m, n) @ δ(m, n) = h(m, n)    (Reproductive)
c. y(m, n) = Σ_{m'=−∞}^{m} Σ_{n'=−∞}^{n} x(m', n')
d. y(m, n) = x(m − m0, n − n0)
e. y(m, n) = exp{−|x(m, n)|}
f. y(m, n) = Σ_{m'} Σ_{n'} x(m', n')
g. y(m, n) = Σ_{m'=0}^{M−1} Σ_{n'=0}^{N−1} x(m', n') exp{−j2πmm'/M} exp{−j2πnn'/N}
2.5 a. Determine the convolution of x(m, n) of Example 2.1 with each of the following arrays, where the boxed element denotes the (0, 0) location.
[Arrays (i), (ii), and (iii) for this problem.]
2.6
2.7
2.8.
[Figure P2.9: f(x, y) → scaling → f(ax, by) → system 𝒮 → g(x, y).]
2.10 (Hankel transform) Show that in polar coordinates the two-dimensional Fourier transform becomes

F_p(ξ, φ) ≜ F(ξ cos φ, ξ sin φ) = ∫_0^∞ ∫_0^{2π} f_p(r, θ) exp[−j2πrξ cos(θ − φ)] r dr dθ

where f_p(r, θ) ≜ f(r cos θ, r sin θ). Hence, show that if f(x, y) is circularly symmetric, then its Fourier transform is also circularly symmetric and is given by

F_p(ρ) = 2π ∫_0^∞ r f_p(r) J_0(2πrρ) dr,   J_0(x) ≜ (1/2π) ∫_0^{2π} exp(−jx cos θ) dθ

The pair f_p(r) and F_p(ρ) is called the Hankel transform pair of zero order.
2.11 Prove the properties of the Z-transform listed in Table 2.5.
2.12 For each of the following linear systems, determine the transfer function, frequency response, OTF, and MTF.
a. y(m, n) − ρ1 y(m − 1, n) − ρ2 y(m, n − 1) = x(m, n)
b. y(m, n) − ρ1 y(m − 1, n) − ρ2 y(m, n − 1) + ρ1 ρ2 y(m − 1, n − 1) = x(m, n)
2.13 What is the impulse response of each filter?
a. Transfer function: H1(z1, z2) = 1 − a1 z1^{−1} − a2 z2^{−1} − a3 z1^{−1} z2^{−1} − a4 z1 z2
b. Frequency response: H(ω1, ω2) = 1 − 2α cos ω1 − 2α cos ω2.
2.14 a. Write the convolution of two sequences {1, 2, 3, 4} and {−1, 2, −1} as a Toeplitz matrix operating on a 3 × 1 vector and then as a Toeplitz matrix operating on a 4 × 1 vector.
b. Write the convolution of two periodic sequences {1, 2, 3, 4, ...} and {−1, 2, −1, 0, ...}, each of period 4, as a circulant matrix operating on a 4 × 1 vector that represents the first sequence.
2.15 (Matrix trace and related formulas)
a. Show that for square matrices A and B, Tr[A] = Tr[A^T] = Σ_{i=1}^{N} λ_i, Tr[A + B] = Tr[A] + Tr[B], and Tr[AB] = Tr[BA], where the λ_i are the eigenvalues of A.
b. Define ∇_A(𝒯) ≜ {∂ Tr[𝒯] / ∂a(m, n)}. Then show that ∇_A(AB) = B^T, ∇_A(ABA^T) = AB^T + AB, and ∇_A(A^{−1}BAC) = −(A^{−1}BACA^{−1})^T + (CA^{−1}B)^T.
2.16 Express the two-dimensional convolutions of Problem 2.5(a) as a doubly Toeplitz block matrix operating on a 6 × 1 vector obtained by column ordering of x(m, n).
2.17 In the two-dimensional linear system of (2.8), x(m, n) and y(m, n) are of size M × N and are mapped into column-ordered vectors x and y, respectively. Write this as a matrix equation

y = ℋ x

and show that ℋ is an N × N block matrix of basic dimension M × M that satisfies the properties listed in Table P2.17.
TABLE P2.17 Impulse Response (and Covariance) Sequences and Corresponding Block Matrix Structures
BIBLIOGRAPHY

Sections 2.1-2.6

Sections 2.9-2.13
Image Perception
3.1 INTRODUCTION
I(λ) = ρ(λ) L(λ)    (3.1)

where ρ(λ) represents the reflectivity or transmissivity of the object and L(λ) is the incident energy distribution. The illumination range over which the visual system can operate is roughly 1 to 10^10, or 10 orders of magnitude.
The retina of the human eye (Fig. 3.1) contains two types of photoreceptors called rods and cones. The rods, about 100 million in number, are relatively long and thin. They provide scotopic vision, which is the visual response at the lower several orders of magnitude of illumination. The cones, many fewer in number
Figure 3.1 Cross section of the eye, showing the cornea, iris, lens, retina, fovea, and optic nerve.
(about 6.5 million), are shorter and thicker and are less sensitive than the rods. They provide photopic vision, the visual response at the higher 5 to 6 orders of magnitude of illumination (for instance, in a well-lighted room or bright sunlight). In the intermediate region of illumination, both rods and cones are active and provide mesopic vision. We are primarily concerned with photopic vision, since electronic image displays are well lighted.
The cones are also responsible for color vision. They are densely packed in the center of the retina (called the fovea) at a density of about 120 cones per degree of arc subtended in the field of vision. This corresponds to a spacing of about 0.5 min of arc, or 2 μm. The density of cones falls off rapidly outside a circle of 1° radius from the fovea. The pupil of the eye acts as an aperture. In bright light it is about 2 mm in diameter and acts as a low-pass filter (for green light) with a passband of about 60 cycles per degree.
The cones are laterally connected by horizontal cells and have a forward connection with bipolar cells. The bipolar cells are connected to ganglion cells, which join to form the optic nerve that provides communication to the central nervous system.
The luminance or intensity of a spatially distributed object with light distribution I(x, y, λ) is defined as

f(x, y) = ∫_0^∞ I(x, y, λ) V(λ) dλ    (3.2)

where V(λ) is called the relative luminous efficiency function of the visual system. For the human eye, V(λ) is a bell-shaped curve (Fig. 3.2) whose characteristics
[Figure 3.2 The relative luminous efficiency function V(λ).]
depend on whether it is scotopic or photopic vision. The luminance of an object is independent of the luminances of the surrounding objects. The brightness (also called apparent brightness) of an object is the perceived luminance and depends on the luminance of the surround. Two objects with different surroundings could have identical luminances but different brightnesses. The following visual phenomena exemplify the differences between luminance and brightness.
Simultaneous Contrast
In Fig. 3.3a, the two smaller squares in the middle have equal luminance values, but the one on the left appears brighter. On the other hand, in Fig. 3.3b, the two squares appear about equal in brightness although their luminances are quite different. The reason is that our perception is sensitive to luminance contrast rather than to the absolute luminance values themselves.
According to Weber's law [2, 3], if the luminance f_o of an object is just noticeably different from the luminance f_s of its surround, then their ratio satisfies

|f_o − f_s| / f_s = constant    (3.3)

The value of the constant has been found to be 0.02, which means that at least 50 levels are needed for the contrast on a scale of 0 to 1. Equation (3.4) says
Figure 3.3 Simultaneous contrast: (a) the small squares in the middle have equal luminances but do not appear equally bright; (b) the small squares in the middle appear almost equally bright, but their luminances are different.
[Table: contrast models (contrast c as a function of luminance f), including a background-ratio law with f_s the background luminance and a logarithmic law. The luminance f lies in the interval [0, 100] except in the logarithmic law; the contrast scale is over [0, 100].]
The choice of n = 3 has been preferred over the logarithmic law in an image coding study [7]. However, the logarithmic law remains the most widely used choice.
[Figure 3.4 Contrast models: contrast versus luminance f; the curve 50 log10 f (logarithmic law) is among those shown.]
"
•
•
-., luminance
•
i\
-
"..
•
e
(.?
,
~'lo ... Brightness
•
~---~
r---!.. I ,
1 " ' - - - """"'"
...---4,- -- ......~
'--"'--~'
'=1"'''''-_:":_:-:.=l-J i
•
•
Distance
,
(b) Luminance verlus brightness.. Figure 3.5 Mach band effect.
•
I
•
/ ......., Brightness
I
...... _-------
I Lumtnance
I
I
l
D 8 -- .. ' ..
lal D - dark band, B - bright band. D FJ
Distance
1.0
0.75
0.50
• 0.25
•
3.3 MTF OF THE VISUAL SYSTEM
The Mach band effect measures the response of the visual system in spatial coordinates. The Fourier transform of the impulse response gives the frequency response of the system, from which its MTF can be determined. A direct measurement of the MTF is possible by considering a sinusoidal grating of varying contrast (ratio of the maximum to minimum intensity) and spatial frequency (Fig. 3.7a). Observation of this figure (at a distance of about 1 m) shows the thresholds of visibility at
Figure 3.7 MTF of the human visual system: (a) contrast versus spatial frequency sinusoidal grating; (b) typical MTF plot (contrast sensitivity in percent versus spatial frequency in cycles/degree).
various frequencies. The curve representing these thresholds is also the MTF, and it varies with the viewer as well as the viewing angle. Its typical shape is of the form shown in Fig. 3.7b. The curve actually observed from Fig. 3.7a is your own MTF (distorted by the printing process). The shape of the curve is similar to a band-pass filter and suggests that the human visual system is most sensitive to midfrequencies and least sensitive to high frequencies. The frequency at which the peak occurs varies with the viewer and generally lies between 3 and 10 cycles/degree. In practice, the contrast sensitivity also depends on the orientation of the grating, being maximum for horizontal and vertical gratings. However, the angular sensitivity variations are within 3 dB (the maximum deviation is at 45°) and, to a first approximation, the MTF can be considered to be isotropic and the phase effects can be ignored. A curve-fitting procedure [6] has yielded a formula for the frequency response of the visual system as

H(ρ) = A (α + ρ/ρ0) exp[−(ρ/ρ0)^β]

where A, α, β, and ρ0 are constants. For α ≈ 0 and β ≈ 1, ρ0 is the frequency at which the peak occurs. For example, in an image coding application [6], the values A = 2.6, α = 0.0192, ρ0 = (0.114)^{−1} = 8.772, and β = 1.1 have been found useful. The peak frequency is 8 cycles/degree and the peak value is normalized to unity.
3.4 THE VISIBILITY FUNCTION
Figure 3.8 Visibility function noise source model. The filter with impulse response h(m, n) applied to u(m, n) determines the masking function e(m, n); the noise source output q(m, n) depends on the masking function amplitude |e|.
1. e(m, n) = u(m, n) − u(m − 1, n)
2. e(m, n) = u(m, n) − a1 u(m − 1, n) − a2 u(m, n − 1) + a3 u(m − 1, n − 1)
3. e(m, n) = u(m, n) − a[u(m − 1, n) + u(m + 1, n) + u(m, n − 1) + u(m, n + 1)]
The visibility function measures the subjective visibility of noise in a scene containing this masking-function-dependent noise q(m, n). It is measured as follows. For a suitably small Δx and a fixed interval [x, x + Δx], add white noise of power P_x to all those pixels in the original image where the masking function magnitude |e| lies in this interval. Then obtain another image by adding white noise of power P_w to all the pixels, such that the two images are subjectively equivalent based on a subjective scale rating, such as the one shown in Table 3.3. Then the visibility function v(x) is defined as the ratio of these two noise powers [4].
Figure 3.9 A simplified monochrome vision model: the luminance passes through a point nonlinearity to give the contrast c(x, y), which is filtered by H(ξ₁, ξ₂) to give the brightness b(x, y).

g(·), yields the contrast c(x, y). The lateral inhibition phenomenon is represented by a spatially invariant, isotropic, linear system whose frequency response is H(ξ₁, ξ₂). Its output is the neural signal, which represents the apparent brightness b(x, y). For an optically well-corrected eye, the low-pass filter has a much slower drop-off with increasing frequency than that of the lateral inhibition mechanism. Thus the optical effects of the eye can be ignored, and the simpler model showing the transformation between the luminance and the brightness suffices.
Results from experiments using sinusoidal gratings indicate that spatial frequency components separated by about an octave can be detected independently by observers. Thus, it has been proposed [7] that the visual system contains a number of independent spatial channels, each tuned to a different spatial frequency and orientation angle. This yields a refined model, which is useful in the analysis and evaluation of image processing systems that are far from the optimum and introduce large levels of distortion. For near-optimum systems, where the output image is only slightly degraded, the simplified model of Fig. 3.9 is adequate and is the one with which we shall mostly be concerned.
3.6 IMAGE FIDELITY CRITERIA

Image fidelity criteria are useful for measuring image quality and for rating the performance of a processing technique or a vision system. There are two types of criteria that are used for evaluating image quality: subjective and quantitative.
The subjective criteria use rating scales such as goodness scales and impairment scales. A goodness scale may be a global scale or a group scale (Table 3.2). The overall goodness criterion rates image quality on a scale ranging from excellent to unsatisfactory. A training set of images is used to calibrate such a scale. The group goodness scale is based on comparisons within a set of images.
The impairment scale (Table 3.3) rates an image on the basis of the level of degradation present in an image when compared with an ideal image. It is useful in applications such as image coding, where the encoding process introduces degradations in the output image.
Sometimes a method called bubble sort is used in rating images. Two images A and B from a group are compared and their order is determined (say it is AB). Then a third image C is compared with B and the order ABC or ACB is established. If the order is ACB, then A and C are compared and the new order is established. In this way, the best image bubbles to the top if no ties are allowed. Numerical ratings may be given after the images have been ranked.
If several observers are used in the evaluation process, then the mean rating is given by

    R = Σ_{k=1}^{n} s_k n_k / Σ_{k=1}^{n} n_k

where s_k is the score associated with the kth rating, n_k is the number of observers with this rating, and n is the number of grades in the scale.
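As a quick illustration (a minimal sketch in Python; the scores and observer counts below are invented), the mean rating is just a weighted average of the scale scores:

    scores = [5, 4, 3, 2, 1]        # s_k: goodness-scale scores
    counts = [10, 7, 3, 0, 0]       # n_k: observers giving each score
    R = sum(s * n for s, n in zip(scores, counts)) / sum(counts)
    print(R)                        # 4.35 for these data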
Among the quantitative measures, a class of criteria used often is called the mean square criterion. It refers to some sort of average or sum (or integral) of squares of the error between two images. For M × N images u(m, n) and u′(m, n) (or v(x, y) and v′(x, y) in the continuous case), the quantity

    σ_e² = (1/MN) Σ_{m=1}^{M} Σ_{n=1}^{N} |u(m, n) − u′(m, n)|²   or   ∫∫ |v(x, y) − v′(x, y)|² dx dy     (3.9)

or

    ∫∫ E[|v(x, y) − v′(x, y)|²] dx dy     (3.11)

called the average mean square or integral mean square error, is also used many times. In many applications the (mean square) error is expressed in terms of a signal-to-noise ratio (SNR), which is defined in decibels (dB) as

    SNR = 10 log₁₀ (σ²/σ_e²)     (3.12)

where σ² is the variance of the desired (or original) image.
Another definition of SNR, used commonly in image coding applications, is

    SNR′ = 10 log₁₀ [(peak-to-peak value of the reference image)² / σ_e²]     (3.13)

The mean square error can also be evaluated on the perceived brightness fields b(x, y) and b′(x, y) obtained from the vision model of Fig. 3.9, that is,

    σ_e² = (1/A) ∫∫ |b(x, y) − b′(x, y)|² dx dy     (3.14)
        = (1/A) ∫∫ |B(ξ₁, ξ₂) − B′(ξ₁, ξ₂)|² dξ₁ dξ₂     (3.15)

where B(ξ₁, ξ₂) is the Fourier transform of b(x, y) and (3.15) follows by virtue of the Parseval theorem. From Fig. 3.9 we now obtain
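As an illustration of these quantitative measures, the following sketch (Python with NumPy assumed; array names are illustrative) computes the mean square error of (3.9), the SNR of (3.12), and the peak-to-peak form of (3.13) for a pair of 8-bit images:

    import numpy as np

    def mse(u, u_prime):
        # average mean square error, eq. (3.9)
        return np.mean((u.astype(float) - u_prime.astype(float)) ** 2)

    def snr_db(u, u_prime):
        # SNR of eq. (3.12): variance of the reference over the error power
        return 10 * np.log10(np.var(u.astype(float)) / mse(u, u_prime))

    def snr_peak_db(u, u_prime, peak=255.0):
        # SNR' of eq. (3.13), using the peak-to-peak value of an 8-bit image
        return 10 * np.log10(peak ** 2 / mse(u, u_prime))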
3.7 COLOR REPRESENTATION

The study of color is important in the design and development of color vision systems. Use of color in image displays is not only more pleasing, but it also enables us to receive more visual information. While we can perceive only a few dozen gray levels, we have the ability to distinguish between thousands of colors. The perceptual attributes of color are brightness, hue, and saturation. Brightness represents the perceived luminance, as mentioned before. The hue of a color refers to its "redness," "greenness," and so on. For monochromatic light sources, differences in
Figure 3.10 Perceptual representation of the color space. The brightness W* varies along the vertical axis, hue θ varies along the circumference, and saturation S varies along the radius; the pure (spectral) colors lie on the outer boundary and white at the center.

Figure 3.11 (a) Typical absorption spectra (not to scale) of the three types of cones S₁(λ), S₂(λ), S₃(λ) in the human retina; (b) three-receptor model for color representation, with cone responses αᵢ(C) = ∫Sᵢ(λ)C(λ)dλ, i = 1, 2, 3.
hues are manifested by differences in wavelength. Saturation is that aspect of perception that varies most strongly as more and more white light is added to a monochromatic light. These definitions are somewhat imprecise because hue, saturation, and brightness all change when either the wavelength, the intensity, or the amount of white light in a color is changed. Figure 3.10 shows a perceptual representation of the color space. Brightness (W*) varies along the vertical axis, hue (θ) varies along the circumference, and saturation (S) varies along the radial distance.¹ For a fixed brightness W*, the symbols R, G, and B show the relative locations of the red, green, and blue spectral colors.
Color representation is based on the classical theory of Thomas Young (1802) [8], who stated that any color can be reproduced by mixing an appropriate set of three primary colors. Subsequent findings, starting from those of Maxwell [9] to more recent ones reported in [10, 11], have established that there are three different types of cones in the (normal) human retina with absorption spectra S₁(λ), S₂(λ), S₃(λ), where λ_min ≤ λ ≤ λ_max, λ_min = 380 nm, λ_max = 780 nm. These responses peak in the yellow-green, green, and blue regions, respectively, of the visible

¹The superscript * used for brightness should not be confused with the complex conjugate. The notation used here is to remain consistent with the commonly used symbols for color coordinates.
(3.20)

where the limits of integration are assumed to be λ_min and λ_max and the sources are linearly independent, i.e., a linear combination of any two sources cannot produce the third source. To match a color C(λ), suppose the three primaries are mixed in proportions β_k, k = 1, 2, 3 (Fig. 3.12). Then Σ_{k=1}^{3} β_k P_k(λ) should be perceived as C(λ), i.e.,

    αᵢ(C) = ∫ [Σ_{k=1}^{3} β_k P_k(λ)] Sᵢ(λ) dλ = Σ_{k=1}^{3} β_k ∫ Sᵢ(λ) P_k(λ) dλ,    i = 1, 2, 3     (3.21)
Defining the ith cone response generated by one unit of the kth primary as

    a_{i,k} ≜ ∫ Sᵢ(λ) P_k(λ) dλ,    i, k = 1, 2, 3     (3.22)

we get

    Σ_{k=1}^{3} β_k a_{i,k} = αᵢ(C) = ∫ Sᵢ(λ) C(λ) dλ,    i = 1, 2, 3     (3.23)

These are the color matching equations. Given an arbitrary color spectral distribution C(λ), the primary sources P_k(λ), and the spectral sensitivity curves Sᵢ(λ), the quantities β_k, k = 1, 2, 3, can be found by solving these equations. In practice, the primary sources are calibrated against a reference white light source with known
Figure 3.12 Color matching: the three primaries P_k(λ), mixed in proportions β_k, match the color C(λ) as seen through the cone sensitivities Sᵢ(λ).

energy distribution, and the tristimulus values of the color C are defined as

    T_k(C) = β_k / w_k,    k = 1, 2, 3     (3.24)

where w_k is the amount of the kth primary required to match the reference white. Matching a unit-energy monochromatic source at wavelength λ′ gives the spectral tristimulus values T_k(λ′), which satisfy

    Σ_{k=1}^{3} w_k T_k(λ′) a_{i,k} = Sᵢ(λ′),    i = 1, 2, 3     (3.25)

for each λ′. Given the spectral tristimulus values T_k(λ), the tristimulus values of an arbitrary color C(λ) can be calculated as (Problem 3.8)

    T_k(C) = ∫ T_k(λ) C(λ) dλ,    k = 1, 2, 3     (3.26)
Example 3.1

The primary sources recommended by the CIE as standard sources are three monochromatic sources

    P₁(λ) = δ(λ − λ₁),  λ₁ = 700 nm,  red
    P₂(λ) = δ(λ − λ₂),  λ₂ = 546.1 nm,  green
    P₃(λ) = δ(λ − λ₃),  λ₃ = 435.8 nm,  blue

Using (3.22), we obtain a_{i,k} = Sᵢ(λ_k), i, k = 1, 2, 3. The standard CIE white source has a flat spectrum, C^W(λ) = 1. Therefore, αᵢ(W) = ∫Sᵢ(λ)dλ. Using these two relations in (3.23) for the reference white, we can write

    Σ_{k=1}^{3} w_k Sᵢ(λ_k) = ∫ Sᵢ(λ) dλ,    i = 1, 2, 3     (3.27)

which can be solved for the w_k provided {Sᵢ(λ_k), 1 ≤ i, k ≤ 3} is a nonsingular matrix. Using the spectral sensitivity curves and w_k, one can solve (3.25) for the spectral tristimulus values T_k(λ) and obtain their plot as in Fig. 3.13. Note that some of the tristimulus values are negative. This means that the source with a negative tristimulus value, when mixed with the given color, will match an appropriate mixture of the other two sources.
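To make the color matching equations concrete, here is a minimal numerical sketch (Python with NumPy assumed; the sensitivity values are invented placeholders, not measured cone data). It solves (3.23) for the mixing proportions β_k of the delta-function primaries of Example 3.1, for which a_{i,k} = Sᵢ(λ_k):

    import numpy as np

    # a[i, k] = S_i(lambda_k): hypothetical cone sensitivities at the three
    # primary wavelengths (illustrative numbers only)
    A = np.array([[0.90, 0.30, 0.05],
                  [0.40, 0.80, 0.10],
                  [0.02, 0.10, 0.90]])

    # alpha_i(C) = integral of S_i(lambda) C(lambda) d lambda for the color to match
    alpha = np.array([0.6, 0.5, 0.2])

    beta = np.linalg.solve(A, alpha)   # mixing proportions of eq. (3.23)
    print(beta)                        # a negative entry means C lies outside the gamut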
It is safe to say that no one set of three primary sources can match all the visible colors, although for any given color a suitable set of three primary sources can be found. Hence, the primary sources for color reproduction should be chosen to maximize the number of colors that can be matched.
The preceding theory of colorimetry leads to a useful set of color matching rules [13], which are stated next.
Figure 3.13 Spectral matching tristimulus curves T₁(λ), T₂(λ), T₃(λ) for the CIE spectral primary system. The negative tristimulus values indicate that the colors at those wavelengths cannot be reproduced by the CIE primaries.
1. Any color can be matched by mixing at most three colored lights. This means we can always find three primary sources such that the matrix {a_{i,k}} is nonsingular and (3.23) has a unique solution.
2. The luminance of a color mixture is equal to the sum of the luminances of its components. The luminance Y of a colored light C(λ) can be obtained via (3.2) as (here the dependence on x, y is suppressed)

    Y = ∫ C(λ) V(λ) dλ     (3.28)

From this formula, the luminance of the kth primary source with tristimulus setting β_k = w_k T_k (see Fig. 3.12) will be T_k w_k ∫ P_k(λ)V(λ) dλ. Hence the luminance of a color with tristimulus values T_k, k = 1, 2, 3 can also be written as

    Y = Σ_{k=1}^{3} T_k ∫ w_k P_k(λ) V(λ) dλ ≜ Σ_{k=1}^{3} l_k T_k     (3.29)

where l_k is called the luminosity coefficient of the kth primary.

The reader should be cautioned that in general

    C(λ) ≠ Σ_{k=1}^{3} w_k T_k P_k(λ)     (3.30)
Chromaticity Diagram

The chromaticities of a color are defined as

    t_k = T_k / (T₁ + T₂ + T₃),    k = 1, 2, 3     (3.31)

Clearly t₁ + t₂ + t₃ = 1. Hence, only two of the three chromaticity coordinates are independent. Therefore, the chromaticity coordinates project the three-dimensional color solid onto a plane. The chromaticities t₁, t₂ jointly represent the
Figure 3.14 Chromaticity diagram for the CIE spectral primary system. The shaded area is the color gamut of this system.
chrominance components (i.e., hue and saturation) of the color. The entire color space can be represented by the coordinates (t₁, t₂, Y), in which any Y = constant plane is a chrominance plane. The chromaticity diagram represents the color subspace in the chrominance plane. Figure 3.14 shows the chromaticity diagram for the CIE spectral primary system. The chromaticity diagram has the following properties:

1. The locus of all the points representing spectral colors contains the region of all the visible colors.
2. The straight line joining the chromaticity coordinates of blue (360 nm) and red (780 nm) contains the purple colors and is called the line of purples.
3. The region bounded by the straight lines joining the coordinates (0, 0), (0, 1), and (1, 0) (the shaded region of Fig. 3.14) contains all the colors reproducible by the primary sources. This region is called the color gamut of the primary sources.
4. The reference white of the CIE primary system has chromaticity coordinates (1/3, 1/3). Colors lying close to this point are the less saturated colors; colors located far from this point are the more saturated colors. Thus the spectral colors and the colors on the line of purples are maximally saturated.
3.9 COLOR COORDINATE SYSTEMS

There are several color coordinate systems (Table 3.4), which have come into existence for a variety of reasons. Among the systems listed are the CIE spectral primary (R, G, B) system, the CIE X, Y, Z system, the uniform chromaticity scale (u, v, Y) system, the NTSC receiver primary (R_N, G_N, B_N) and transmission (Y, I, Q) systems, and the L*, a*, b* system. For example, the NTSC receiver primaries are obtained from the X, Y, Z coordinates as

    R_N = 1.910X − 0.533Y − 0.288Z
    G_N = −0.985X + 2.000Y − 0.028Z
    B_N = 0.058X − 0.118Y + 0.896Z

In the NTSC transmission system, Y is the luminance and I, Q are the chrominances. In the L*, a*, b* system, L* represents brightness, a* the red-green content,

    a* = 500[(X/X₀)^(1/3) − (Y/Y₀)^(1/3)]

and b* the yellow-blue content,

    b* = 200[(Y/Y₀)^(1/3) − (Z/Z₀)^(1/3)]
As mentioned before, the CIE spectral primary sources do not yield a gamut covering all the visible colors. In fact, no practical set of three primaries has been found that can reproduce all colors. This has led to the development of the CIE X, Y, Z system with hypothetical primary sources such that all the spectral tristimulus values are positive. Although the primary sources are physically unrealizable, this is a convenient coordinate system for colorimetric calculations. In this system Y represents the luminance of the color. The X, Y, Z coordinates are related to the CIE R, G, B system via the linear transformation shown in Table 3.4. Figure 3.15 shows the chromaticity diagram for this system. The reference white for this system has a flat spectrum, as in the R, G, B system. The tristimulus values for the reference white are X = Y = Z = 1.
Figure 3.15 also contains several ellipses of different sizes and orientations. These ellipses, also called MacAdam ellipses [10, 11], are such that all the colors lying inside an ellipse are indistinguishable. Any color lying just outside the ellipse is just noticeably different (JND) from the color at the center of the ellipse. The size, orientation, and eccentricity (ratio of major to minor axis) of these ellipses vary throughout the color space. The uniform chromaticity scale (UCS) system u, v, Y transforms these elliptical contours of large eccentricity (up to 20 : 1) to near circles (eccentricity ≈ 2 : 1) of almost equal size in the u, v plane. It is related to the X, Y, Z system via the transformation shown in Table 3.4. Note that x, y and u, v are the chromaticity coordinates and Y is the luminance. Figure 3.16 shows the chromaticity diagram of the UCS coordinate system. The tristimulus coordinates corresponding to u, v, and w ≜ 1 − u − v are labeled U, V, and W, respectively.
The U*, V*, W* system is a modified UCS system whose origin (u₀, v₀) is shifted to the reference white in the u, v chromaticity plane. The coordinate W* is a cube-root transformation of the luminance and represents the contrast (or
Figure 3.15 Chromaticity diagram for the CIE XYZ color coordinate system. The (MacAdam) ellipses are the just-noticeable color difference ellipses.

Figure 3.16 Chromaticity diagram for the CIE UCS color coordinate system.
brightness) of a uniform color patch. This coordinate system is useful for measuring color differences quantitatively. In this system, for unsaturated colors, i.e., for colors lying near the grays in the color solid, the difference between two colors is, to a good approximation, proportional to the length of the straight line joining them.
The S, θ, W* system is simply the polar representation of the U*, V*, W* system, where S and θ represent, respectively, the saturation and hue attributes of color (Fig. 3.10). Large values of S imply highly saturated colors.
The National Television Systems Committee (NTSC) receiver primary system (R_N, G_N, B_N) was developed as a standard for television receivers. The NTSC has adopted three phosphor primaries that glow in the red, green, and blue regions of the visible spectrum. The reference white was chosen as the illuminant C, for which the tristimulus values are R_N = G_N = B_N = 1. Table 3.5 gives the NTSC coordinates of some of the major colors. The color solid for this coordinate system is a cube (Fig. 3.17). The chromaticity diagram for this system is shown in Fig. 3.18. Note that the reference white for NTSC is different from that for the CIE system.
The NTSC transmission system (Y, I, Q) was developed to facilitate transmission of color images using the existing monochrome television channels without increasing the bandwidth requirement. The Y coordinate is the luminance (monochrome channel) of the color. The other two tristimulus signals, I and Q, jointly represent the hue and saturation of the color, and their bandwidths are much smaller than that of the luminance signal. The I, Q components are transmitted on a subcarrier channel using quadrature modulation in such a way that the spatial
Table 3.5 Tristimulus and Chromaticity Values of Major Colors in the NTSC Receiver Primary System

Figure 3.17 Tristimulus color solid for the NTSC receiver primary system. Figure 3.18 Chromaticity diagram for the NTSC receiver primary system.
spectra of I, Q do not overlap with that of Y and the overall bandwidth required for transmission remains unchanged (see Chapter 4). The Y, I, Q system is related to the R_N, G_N, B_N system via a linear transformation. This and some other transformations relating the different coordinate systems are given in Table 3.6.
The L*, a*, b* system gives a quantitative expression for the Munsell system of color classification [12]. Like the U*, V*, W* system, this also gives a useful color difference formula.

Example 3.2

We will find the representation of the NTSC receiver primary yellow in the various coordinate systems. From Table 3.5 we have R_N = 1.0, G_N = 1.0, B_N = 0.0. Using Table 3.6, we obtain the CIE spectral primary system coordinates as R = 1.167 − 0.146 − 0.0 = 1.021, G = 0.114 + 0.753 + 0.0 = 0.867, B = −0.001 + 0.059 + 0.0 = 0.058. The corresponding chromaticity values are
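For readers who want to carry out conversions like Example 3.2 numerically, the sketch below (Python with NumPy assumed) applies the commonly quoted NTSC R_N, G_N, B_N-to-Y, I, Q matrix; these coefficients are assumed here to match the corresponding entry of Table 3.6.

    import numpy as np

    # Commonly quoted NTSC RGB -> YIQ transformation (assumed to match Table 3.6)
    RGB_TO_YIQ = np.array([[0.299,  0.587,  0.114],
                           [0.596, -0.274, -0.322],
                           [0.211, -0.523,  0.312]])

    rgb_yellow = np.array([1.0, 1.0, 0.0])   # NTSC receiver primary yellow
    y, i, q = RGB_TO_YIQ @ rgb_yellow
    print(y, i, q)                           # luminance and chrominance components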
Table 3.6 Transformations among color coordinate systems (output vector, transformation matrix, comments).
3.10 COLOR DIFFERENCE MEASURES

The coefficients c_{i,j} measure the average human perceptual sensitivity to small differences in the ith and the jth coordinates. Small differences in color are described by observations of just noticeable differences (JNDs) in colors. A unit JND, defined by

    1 = Σ_{i=1}^{3} Σ_{j=1}^{3} c_{i,j} dx_i dx_j     (3.33)

is the describing equation of an ellipsoid. If the coefficients c_{i,j} were constant throughout the color space, then the JND ellipsoids would be of uniform size in the color space. In that event, the color space could be reduced to a Euclidean tristimulus space, where the color difference between any two colors would become proportional to the length of the straight line joining them. Unfortunately, the c_{i,j} exhibit large variations with the tristimulus values, so that the sizes as well as the orientations of the JND ellipsoids vary considerably. Consequently, the distance between two arbitrary colors C₁ and C₂ is given by the minimal chain of ellipsoids lying along a curve 𝒞 joining C₁ and C₂, such that the distance integral

    d(C₁, C₂) = ∫_{C₁}^{C₂} ds   (evaluated along 𝒞)     (3.34)

is minimum, i.e., for 𝒞 = 𝒞*. This curve is called the geodesic between C₁ and C₂. If the c_{i,j} are constant in the tristimulus space, then the geodesic is a straight line. Geodesics in color space can be determined by employing a suitable optimization technique such as dynamic programming or the calculus of
Figure 3.19 Projections of geodesics between the major NTSC colors on the UCS u, v chromaticity plane.
variations [15]. Figure 3.19 shows the projections of several geodesic curves between the major NTSC colors on the UCS u, v chromaticity plane. The geodesics between the primary colors are nearly straight lines (in the chromaticity plane), but the geodesics between most other colors are generally curved.
Due to the large complexity of the foregoing procedure for determining color distance, simpler measures that can easily be used are desired. Several simple formulas that approximate the Riemannian color space by a Euclidean color space have been proposed by the CIE (Table 3.7). The first of these formulas [eq. (3.35)] was adopted by the CIE in 1964. The formula of (3.36), called the CIE 1976 L*, u*, v* formula, is an improvement over the 1964 CIE U*, V*, W* formula in regard to uniform spacing of colors that exhibit differences in sizes typical of those in the Munsell book of color [12].
The third formula, (3.37), is called the CIE 1976 L*, a*, b* color-difference formula. It is intended to yield perceptually uniform spacing of colors that exhibit color differences greater than the JND threshold but smaller than those in the Munsell book of color.
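As an illustration of such a Euclidean color-difference measure, the minimal sketch below (Python; the L*, a*, b* triples are invented, and the Euclidean form is assumed to be what the 1976 L*a*b* formula of (3.37) denotes) computes the difference between two colors given their L*, a*, b* coordinates:

    import math

    def delta_e_lab(lab1, lab2):
        # CIE 1976 L*a*b* difference: Euclidean distance in (L*, a*, b*)
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

    print(delta_e_lab((50.0, 10.0, -5.0), (52.0, 8.0, -2.0)))  # illustrative values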
3.11 COLOR VISION MODEL

A model for color vision is shown in Fig. 3.20. Its first stage forms the cone responses normalized by the responses to the reference white,

    T_k(x, y) ≜ α_k(x, y, C) / α_k(x, y, W),    k = 1, 2, 3     (3.38)
Figure 3.20 A color vision model. The inputs R_N(x, y), G_N(x, y), B_N(x, y) are transformed by the matrix A into T₁, T₂, T₃, passed through logarithmic point nonlinearities to give T̃_k(x, y), transformed by the matrix B into C₁(x, y), C₂(x, y), C₃(x, y), and filtered by H₁, H₂, H₃ to give the perceived fields B₁(x, y), B₂(x, y), B₃(x, y).
In analogy with the definition of tristimulus values, the T_k are called the retinal cone tristimulus coordinates (see Problem 3.14). The cone responses undergo nonlinear point transformations to give three fields T̃_k(x, y), k = 1, 2, 3. The 3 × 3 matrix B transforms the {T̃_k(x, y)} into {C_k(x, y)} such that C₁(x, y) is the monochrome (achromatic) contrast field c(x, y), as in Fig. 3.9, and C₂(x, y) and C₃(x, y) represent the corresponding chromatic fields. The spatial filters H_k(ξ₁, ξ₂), k = 1, 2, 3, represent the frequency response of the visual system to luminance and chrominance contrast signals. Thus H₁(ξ₁, ξ₂) is the same as H(ξ₁, ξ₂) in Fig. 3.9 and is a band-pass filter that represents the lateral inhibition phenomenon. The visual frequency responses to chrominance signals are not well established but are believed to have their passbands in the lower frequency region, as shown in Fig. 3.21. The 3 × 3 matrices A and B effect these transformations.
From the model of Fig. 3.20, a criterion for color image fidelity can be defined. For example, for two color images {R_N, G_N, B_N} and {R′_N, G′_N, B′_N}, their subjective mean square error could be defined by

    e² = (1/A) ∫∫_𝒲 Σ_{k=1}^{3} [B_k(x, y) − B′_k(x, y)]² dx dy     (3.40)

where 𝒲 is the region over which the image is defined (or available), A is its area, and {B_k(x, y)} and {B′_k(x, y)} are the outputs of the model for the two color images.

Figure 3.21 Spatial frequency responses of the visual system to the luminance and chrominance contrast signals.
3.12 TEMPORAL PROPERTIES OF VISION

Temporal aspects of visual perception [1, 18] become important in the processing of motion images and in the design of image displays for stationary images. The main properties that will be relevant to our discussion are summarized here.

Bloch's Law

When a slowly flashing light is observed, the individual flashes are distinguishable. At flashing rates above the critical fusion frequency (CFF), the flashes are indistinguishable from a steady light of the same average intensity. This frequency generally does not exceed 50 to 60 Hz. Figure 3.22 shows a typical temporal MTF.
This property is the basis of television raster scanning cameras and displays. Interlaced image fields are sampled and displayed at rates of 50 or 60 Hz. (The rate is chosen to coincide with the power-line frequency to avoid any interference.) For digital display of still images, modern display monitors are refreshed at a rate of 60 frames/s to avoid any flicker perception.
Figure 3.22 Temporal MTFs for flickering fields: relative sensitivity versus flicker frequency (Hz) for fields containing high and low spatial frequencies.
images by subsampling the moving areas everywhere except at the edges. For the same reason, image display monitors offering high spatial resolution display images at a noninterlaced 60-Hz refresh rate.
PROBLEMS

3.1 Generate two 256 × 256 8-bit images as in Fig. 3.3a, where the small squares have gray-level values of 127 and the backgrounds have the values 63 and 223. Verify the result of Fig. 3.3a. Next change the gray level of one of the small squares until the result of Fig. 3.3b is verified.
3.2 Show that eqs. (3.5) and (3.6) are solutions of a modified Weber law: df/f^γ is proportional to dc, i.e., equal changes in contrast are induced by equal amounts of df/f^γ. Find γ.
3.3 Generate a digital bar chart as shown in Fig. 3.5a, where each bar is 64 pixels wide. Each image line is a staircase function, as shown in Fig. 3.5b. Plot the brightness function (approximately) as you perceive it.
3.4 Generate a 512 × 512 image, each row of which is a smooth ramp r(n), as shown in Fig. P3.4. Display it on a video monitor and locate the dark (D) and the bright (B) Mach bands.

Figure P3.4 The ramp function r(n) for one image row; the gray level varies between 120 and 225 over 1 ≤ n ≤ 512.
3.5 The Mach band phenomenon predicts the one-dimensional step response of the visual system, as shown by s(n) in Fig. P3.5. The corresponding one-dimensional impulse response (or the vertical line response) is given by h(n) = s(n) − s(n − 1). Show that h(n) has negative lobes (which manifest the lateral inhibition phenomenon), as shown in Fig. 3.6c.
3.6 As a rule of thumb, the peak-to-peak value of an image can be estimated as nσ, where n varies between 4 and 6. Letting n = 5 and using (3.13), show that SNR′ ≈ SNR + 14 dB.
Figure P3.5 Step input and the corresponding step response s(n) of the visual system, exhibiting overshoot and undershoot near the edge.
3.7 Can two monochromatic sources with different wavelengths be perceived to have the same color? Explain.
3.8 Using eqs. (3.23) through (3.25), show that (3.26) is valid.
3.9 In this problem we show that any two tristimulus coordinate systems based on different sets of primary sources are linearly related. Let {P_k(λ)} and {P′_k(λ)}, k = 1, 2, 3, be two sets of primary sources with corresponding tristimulus coordinates {T_k} and {T′_k} and reference white sources W(λ) and W′(λ). If a color C(λ) is matched by these sets of sources, then show that

    Σ_{k=1}^{3} w_k T_k a_{i,k} = Σ_{k=1}^{3} w′_k T′_k a′_{i,k},    i = 1, 2, 3

where the definitions of the a's and w's follow from the text. Express this in matrix form and write the solution for {T′_k}.
3.10 Show that given the chromaticities t₁, t₂ and the luminance Y, the tristimulus values of a coordinate system can be obtained by

    T_k = t_k Y / (Σ_{i=1}^{3} l_i t_i),    k = 1, 2, 3
3.11* For all the major NTSC colors listed in Table 3.5, calculate their tristimulus values in the RGB, XYZ, UVW, YIQ, U*V*W*, L*a*b*, SθW*, and (t₁, t₂, Y) coordinate systems. Calculate their chromaticity coordinates in the first three of these systems.
3.12 Among the major NTSC colors, except for white and black (see Table 3.5), which one (a) has the maximum luminance, (b) is most saturated, and (c) is least saturated?
3.13* Calculate the color differences between all pairs of the major NTSC colors listed in Table 3.5 according to the 1964 CIE formula given in Table 3.7. Which pair of colors is (a) maximally different, (b) minimally different? Repeat the calculations using the L*a*b* system formula given by (3.37).
3.14 [Retinal cone system: T₁, T₂, T₃] Let P_k(λ), k = 1, 2, 3 denote the primary sources that generate the retinal cone tristimulus values. Using (3.38), (3.24), and (3.23), show that this requires (for every x, y, C)

    Σ_{k=1}^{3} T_k(x, y, C) a_{i,k} = αᵢ(x, y, C)  ⟹  a_{i,k} = δ(i − k)     (P3.14)

To determine P_k(λ), write
BIBLIOGRAPHY
Sections 3.1-3.3
• •
For further discussion on fundamental topics in visual perception:
•
1. T. N. Cornsweet. Visual Perception. New York: Academic Press, 1971.
2. E. C. Carterette and M. P. Friedman, eds. Handbook of Perception, vol. 5. New York: Academic Press, 1975.
3. S. Hecht. "The Visual Discrimination of Intensity and the Weber-Fechner Law." J. Gen. Physiol. 7 (1924): 241.
Section 3.4

Sections 3.5-3.6

For a detailed development of the monochrome vision model and related image fidelity criteria:
5. C. F. Hall and E. L. Hall. "A Nonlinear Model for the Spatial Characteristics of the Human Visual System." IEEE Trans. Syst. Man Cybern. SMC-7, no. 3 (March 1977): 161-170.
6. J. L. Mannos and D. J. Sakrison. "The Effects of a Visual Fidelity Criterion on the Encoding of Images." IEEE Trans. Inform. Theory IT-20, no. 4 (July 1974): 525-536.
7. D. J. Sakrison. "On the Role of the Observer and a Distortion Measure in Image Transmission." IEEE Trans. Communications COM-25 (Nov. 1977): 1251-1267.
Sections 3.7-3.10
8. T. Young. "On the Theory of Light and Colors." Philosophical Transactions of the Royal Society of London 92 (1802): 20-71.
9. J. C. Maxwell. "On the Theory of Three Primary Colours." Lectures delivered in 1861. W. D. Niven (ed.), Sci. Papers 1, Cambridge Univ. Press, London (1890): 445-450.
10. D. L. MacAdam. Sources of Color Science. Cambridge, Mass.: MIT Press, 1970.
11. G. W. Wyszecki and W. S. Stiles. Color Science. New York: John Wiley, 1967.
12. Munsell Book of Color. Munsell Color Co., 2441 North Calvert St., Baltimore, Md.
13. H. G. Grassmann. "Theory of Compound Colours." Philosophic Magazine 4, no. 7 (1854): 254-264.
Section 3.11

For the color vision model, its applications, and a related bibliography:

16. W. Frei and B. Baxter. "Rate Distortion Coding Simulation for Color Images." IEEE Trans. Communications COM-25 (November 1977): 1385-1392.
17. J. O. Limb, C. B. Rubinstein, and J. E. Thompson. "Digital Coding of Color Video Signals." IEEE Trans. Communications COM-25 (November 1977): 1349-1384.
Section 3.12
Image Sampling and Quantization

4.1 INTRODUCTION

The most basic requirement for computer processing of images is that the images be available in digital form, that is, as arrays of finite-precision numbers. For digitization (Fig. 4.1), the given image is sampled on a discrete grid and each sample or pixel is quantized using a finite number of bits. The digitized image can then be processed by the computer. To display a digital image, it is first converted to an analog signal, which is scanned onto a display.
Figure 4.1 Image sampling, quantization, and display: the input image is scanned and sampled, processed by the digital computer as u′(m, n), converted from digital to analog, and displayed.
Figure 4.2 Television camera (vidicon) scanning: a finite-aperture electron beam scans the photosensitive target on which the illuminated object or film is imaged.
In the United States a standard scanning convention has been adopted by the RETMA. Each complete scan of the target is called a frame, which contains 525 lines.
(Scanner geometry: light source, lens, object/film, and detector.)

Figure 4.5 Interlaced scanning: the odd and even fields of a frame are scanned alternately.
The NTSC composite video signal for a scan line can be written as

    u(t) = Y(t) + I(t) cos(2πf_sc t + φ) + Q(t) sin(2πf_sc t + φ)     (4.1)

where φ = 33° and f_sc is the subcarrier frequency. The quantities Y and (I, Q) are the luminance and chrominance components, respectively, which can be obtained by linearly transforming the R, G, and B signals (see Chapter 3). The half-power bandwidths of Y, I, and Q are approximately 4.2 MHz, 1.3 MHz, and 0.5 MHz, respectively. The color subcarrier frequency f_sc is 3.58 MHz, which is 455f_l/2, where f_l is the scan-line frequency (i.e., 15.75 kHz for NTSC). Since f_sc is an odd multiple of f_l/2 as well as of half the frame frequency f_f/2, the phase of the subcarrier changes by 180° from line to line and from frame to frame. Taking this into account, the NTSC composite video signal with 2 : 1 line interlace can be represented as

    u(x, y, t) = Y(x, y, t) + I(x, y, t) cos(2πf_sc x + φ) cos[π(f_f t − f_l y)] + Q(x, y, t) sin(2πf_sc x + φ) cos[π(f_f t − f_l y)]     (4.2)
The SECAM system uses 625 lines at 25 frames/s with 2 : 1 line interlacing. Each scan line is composed of the luminance signal Y(t) and one of the chrominance signals U ≜ (B − Y)/2.03 or V ≜ (R − Y)/1.14, alternating from line to line. These chrominances are related to the NTSC coordinates as

    I = V cos 33° − U sin 33°,    Q = V sin 33° + U cos 33°     (4.3)

This avoids the quadrature demodulation and the corresponding chrominance shifts due to phase detection errors present in NTSC receivers. The U and V subcarriers are at 4.25 and 4.41 MHz. SECAM also transmits a subcarrier for luminance.
The chrominance component V may also be transmitted with alternating sign, (−1)^m V, where m is the line number. Thus the phase of V changes by 180° between successive lines in the same field. The cross talk between adjacent lines can then be suppressed by averaging them. The U and V signals are allowed the same bandwidth (1.3 MHz), with the carrier located at 4.43 MHz.
requires a perfect low-pass filter. In some displays a very small spot size can be achieved, so that interpolation can be performed digitally to generate a larger array containing estimates of some of the missing samples in between the given samples. This idea is used in bit-mapped computer graphics displays.
The CRT display can be used for recording the image on film by simply imaging the spot through a lens onto the film (basically the same as imaging with a camera whose shutter is open for at least one frame period). Other recorders, such as microdensitometers, project a rectangular aperture of size equal to that of the image pixel so that the image field is completely filled.
Another type of display/recorder is called a halftone display. Such a display can write only black or white dots. By making the dot size much smaller than the pixel size, white or black dots are dispersed pseudorandomly such that the average number of dots per pixel area is equal to the pixel gray level. Due to the spatial integration performed by the eye, such a black-and-white display renders the perception of a gray-level image. Newspapers, magazines, several printer/plotters, graphic displays, and facsimile machines use the halftone method of display.
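A minimal sketch of this idea (Python with NumPy assumed; this is an illustrative pseudorandom-threshold halftone, not a specific algorithm from the text): each gray-level pixel is expanded into a block of binary dots whose average density of white dots matches the normalized gray level.

    import numpy as np

    def pseudorandom_halftone(image, dots_per_pixel=4, levels=255):
        # Render a gray-level image as black/white dots by pseudorandom thresholding
        rng = np.random.default_rng(0)
        big = np.repeat(np.repeat(image / levels, dots_per_pixel, axis=0),
                        dots_per_pixel, axis=1)
        # A dot is white wherever the normalized gray level exceeds a random threshold,
        # so the average fraction of white dots per pixel equals its gray level.
        return (rng.random(big.shape) < big).astype(np.uint8)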
Figure 4.6 (a) Fourier transform of a bandlimited function.

The sampling theory can be understood easily by remembering that the Fourier transform of an arbitrary sampled function is a scaled, periodic replication of the Fourier transform of the original function. To see this, consider the ideal image sampling function, which is a two-dimensional infinite array of Dirac delta functions situated on a rectangular grid with spacing Δx, Δy (Fig. 4.7a), that is, comb(x, y; Δx, Δy) = Σ Σ_{m,n=−∞}^{∞} δ(x − mΔx, y − nΔy). Its Fourier transform is

    ξ_xs ξ_ys Σ Σ_{k,l=−∞}^{∞} δ(ξ₁ − kξ_xs, ξ₂ − lξ_ys),    ξ_xs ≜ 1/Δx,  ξ_ys ≜ 1/Δy     (4.7)

Hence the Fourier transform of the sampled image f_s(x, y) = f(x, y) comb(x, y; Δx, Δy) is

    F_s(ξ₁, ξ₂) = ξ_xs ξ_ys Σ Σ_{k,l} F(ξ₁, ξ₂) ⊛ δ(ξ₁ − kξ_xs, ξ₂ − lξ_ys) = ξ_xs ξ_ys Σ Σ_{k,l} F(ξ₁ − kξ_xs, ξ₂ − lξ_ys)     (4.8)
From (4.8), the Fourier transform of the sampled image is, within a scale factor, a periodic replication of the Fourier transform of the input image on a grid whose spacing is (ξ_xs, ξ_ys) (Fig. 4.7b). From the uniqueness of the Fourier transform, we know that if the spectrum of the original image could be recovered from the spectrum of the sampled image, then we would have the interpolated continuous image from the sampled data.
Figure 4.7 (a) Sampling grid; (b) sampled image spectrum; (c) aliasing and foldover frequencies (shaded areas).
then F(ξ₁, ξ₂) can be recovered by a low-pass filter with frequency response

    H(ξ₁, ξ₂) = 1/(ξ_xs ξ_ys)  for (ξ₁, ξ₂) ∈ 𝓡,  and  0 otherwise     (4.11)

where 𝓡 is any region whose boundary ∂𝓡 is contained within the annular ring between the rectangles 𝓡₁ and 𝓡₂ shown in Fig. 4.7b. This is seen by writing

    F̂(ξ₁, ξ₂) ≜ H(ξ₁, ξ₂) F_s(ξ₁, ξ₂) = F(ξ₁, ξ₂)     (4.12)

that is, the original continuous image can be recovered exactly by low-pass filtering the sampled image. The corresponding interpolation formula is

    f̂(x, y) = Σ Σ_{m,n=−∞}^{∞} f(mΔx, nΔy) sinc(xξ_xs − m) sinc(yξ_ys − n)     (4.15)
(a) Sampled above the Nyquist rate and reconstructed by ZOH; (b) sampled below the Nyquist rate and reconstructed by ZOH; (c) low-pass filtered before subsampling and reconstructed by ZOH; (d) low-pass filtered before subsampling and reconstructed by FOH.
which is equal to f(x, y) if Δx, Δy satisfy (4.10). We can summarize the preceding results by the following theorem.

Sampling Theorem. A bandlimited image f(x, y), with F(ξ₁, ξ₂) = 0 for |ξ₁| ≥ ξ_x0 and |ξ₂| ≥ ξ_y0, can be recovered without error from its samples taken on a rectangular grid provided the sampling frequencies satisfy ξ_xs > 2ξ_x0 and ξ_ys > 2ξ_y0.

Remarks

If the sampling frequencies are less than the Nyquist frequencies 2ξ_x0 and 2ξ_y0, the sampled image spectrum contains overlapping replicas of F(ξ₁, ξ₂) and aliasing results. For example, suppose the image f(x, y) = 2 cos 2π(3x + 4y) is sampled with ξ_xs = ξ_ys = 5. Let the low-pass filter have a rectangular region of support with cutoff frequencies at half the sampling frequencies, that is,

    H(ξ₁, ξ₂) = 1  for −2.5 ≤ ξ₁ ≤ 2.5, −2.5 ≤ ξ₂ ≤ 2.5,  and 0 otherwise
Applying (4.12), we obtain

    F̂(ξ₁, ξ₂) = δ(ξ₁ − 2, ξ₂ − 1) + δ(ξ₁ + 2, ξ₂ + 1)

which gives the reconstructed image as f̂(x, y) = 2 cos 2π(2x + y). This shows that any frequency component in the input image that is above (ξ_xs/2, ξ_ys/2) by (Δξ₁, Δξ₂) is reproduced (or aliased) as a frequency component at (ξ_xs/2 − Δξ₁, ξ_ys/2 − Δξ₂).
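The aliasing effect described above is easy to reproduce numerically. The following sketch (Python with NumPy assumed; names illustrative) samples a one-dimensional cosine below its Nyquist rate and shows that ideal sinc interpolation of the samples, the one-dimensional analog of (4.15), returns the aliased frequency rather than the original one.

    import numpy as np

    fs = 5.0                                  # sampling frequency
    f_in, f_alias = 3.0, 2.0                  # input frequency and its alias (fs - f_in)
    n = np.arange(-50, 51)                    # sample indices
    samples = 2 * np.cos(2 * np.pi * f_in * n / fs)

    # Ideal band-limited (sinc) interpolation of the samples
    x = np.linspace(-2, 2, 1001)
    f_hat = sum(s * np.sinc(x * fs - m) for m, s in zip(n, samples))

    err = np.max(np.abs(f_hat - 2 * np.cos(2 * np.pi * f_alias * x)))
    print(err)   # small: the reconstruction follows the aliased 2 cycles/unit component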
4.3 EXTENSIONS OF SAMPLING THEORY

There are several extensions of the two-dimensional sampling theory that are of interest in image processing.
Remarks

This theorem states that if the random field f(x, y) is sampled above its Nyquist rate, then a continuous random field f̂(x, y) can be reconstructed from the sampled sequence such that f̂ converges to f in the mean square sense. It can be shown that the power spectral density function S_s(ξ₁, ξ₂) of the sampled image f_s(x, y) is a periodic extension of S(ξ₁, ξ₂). When the image is reconstructed by an ideal low-pass filter with gain 1/(ξ_xs ξ_ys), the reconstructed image power spectral density is the product of this periodic extension with

    W(ξ₁, ξ₂) = 1  for (ξ₁, ξ₂) ∈ 𝓡,  and  0 otherwise     (4.22)

The aliasing power σ_a² is the power in the tails of the power spectrum outside 𝓡, that is,

    σ_a² = ∫∫_{(ξ₁,ξ₂)∉𝓡} S(ξ₁, ξ₂) dξ₁ dξ₂

which is zero if f(x, y) is bandlimited with ξ_x0 ≤ ξ_xs/2, ξ_y0 ≤ ξ_ys/2. This analysis is also
(a) Spectrum; (b) rectangular sampling grid G₁; (c) interlaced sampling grid G₂.
Hexagonal Sampling

For functions that are circularly symmetric and/or bandlimited over a circular region, it can be shown that sampling on a hexagonal lattice requires 13.4 percent fewer samples than rectangular sampling. Alternatively, for the same sampling rate, less aliasing is obtained on a hexagonal lattice than on a rectangular lattice. Details are available in [14].
Optimal Sampling

Equation (4.16) provides the interpretation that the sampling process transforms a continuous function f(x, y) into a sequence f(mΔx, nΔy) from which the original function can be recovered. Therefore, the coefficients of any convergent series expansion of f(x, y) can be considered to give a generalized form of sampling. Such sampling is not restricted to bandlimited functions. For bandlimited functions the sinc functions are optimal for recovering the original function f(x, y) from the samples f(mΔx, nΔy). For bandlimited random fields, the reconstructed random field converges to the original in the mean square sense.
More generally, there are functions that are optimal in the sense that they sample a random image to give a finite sequence such that the mean square error between the original and the reconstructed images is minimized. In particular, a series expansion of special interest is

    f(x, y) = Σ Σ_{m,n} a_{m,n} φ_{m,n}(x, y)     (4.24)

where {φ_{m,n}(x, y)} are the eigenfunctions of the autocorrelation function of the random field f(x, y). This is called the Karhunen-Loeve (KL) series expansion of the random field. This expansion is such that the a_{m,n} are orthogonal random variables,
and, for a given number of terms, the mean square error in the reconstructed image is minimum among all possible sampling functions. This property is useful in developing data compression techniques for images.
The main difficulty in utilizing the preceding result for optimal sampling of practical (finite-size) images is in generating the coefficients a_{m,n}. In conventional sampling (via the sinc functions), the coefficients a_{m,n} are simply the values f(mΔx, nΔy), which are easy to obtain. Nevertheless, the theory of the KL expansion is useful in determining bounds on performance and serves as an important guide in the design of many image processing algorithms.
4.4 PRACTICAL LIMITATIONS IN SAMPLING AND RECONSTRUCTION

A practical sampling system gives an output g_s(x, y), which can be modeled as (see Fig. 4.10)

    g(x, y) ≜ p_s(x, y) ⋆ f(x, y) = p_s(−x, −y) ⊛ f(x, y) = ∫∫_A p_s(x′ − x, y′ − y) f(x′, y′) dx′ dy′     (4.25)

where ⋆ denotes correlation and ⊛ convolution.

Figure 4.10 Practical sampling and reconstruction: the input image is convolved with the scanning aperture p_s(−x, −y), sampled ideally on the grid Δx, Δy, and displayed through the display spot p_d(−x, −y). In the ideal case p_s(x, y) = p_d(x, y) = δ(x, y).
which is simply the integral of the image over the scanner aperture at position (x, y). In general, (4.25) represents a low-pass filtering operation whose transfer function is determined by the aperture function p_s(x, y). The overall effect on the reconstructed image is a loss of resolution and a decrease in aliasing error (Fig. 4.11). This effect is also visible in the images of Fig. 4.12.
Figure 4.11 Effect of aperture scanning. Finite-aperture scanning low-pass filters the input image spectrum, causing a loss of resolution but reducing the aliasing power relative to sampling without low-pass filtering.
Figure 4.12 Comparison between zero- and first-order hold interpolators. Zero-order hold gives higher resolution and first-order hold gives greater smoothing. (a) 256 × 256 image interpolated to 512 × 512 by zero-order hold (ZOH); (b) 256 × 256 image interpolated to 512 × 512 by first-order hold (FOH); (c) images after interpolation: (i) 128 × 128 image zoomed by ZOH; (ii) 64 × 64 image zoomed by ZOH; (iii) 128 × 128 image zoomed by FOH; (iv) 64 × 64 image zoomed by FOH.
Figure 4.13 Image interpolation functions; ξ_x0 ≜ 1/(2Δx), ξ_y0 ≜ 1/(2Δy). The rectangle function gives the zero-order hold (ZOH), whose Fourier transform is a product of sinc functions; the triangle function gives the first-order hold (FOH), whose transform is a product of squared sinc functions; the sinc interpolator (1/ΔxΔy) sinc(x/Δx) sinc(y/Δy) is the ideal interpolator, whose transform is the rectangular low-pass filter rect(ξ₁/2ξ_x0) rect(ξ₂/2ξ_y0).
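The zero- and first-order holds of Figs. 4.12 and 4.13 correspond to pixel replication and linear interpolation, respectively. A minimal sketch (Python with NumPy assumed; names illustrative) that zooms an image by an integer factor with each method:

    import numpy as np

    def zoom_zoh(image, factor):
        # Zero-order hold: replicate each pixel into a factor x factor block
        return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

    def _foh_1d(row, factor):
        # First-order hold along one row: linear interpolation between samples
        n = len(row)
        x_new = np.arange((n - 1) * factor + 1) / factor
        return np.interp(x_new, np.arange(n), row)

    def zoom_foh(image, factor):
        # Apply the first-order hold separably: rows first, then columns
        tmp = np.array([_foh_1d(r, factor) for r in image.astype(float)])
        return np.array([_foh_1d(c, factor) for c in tmp.T]).T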
(Reconstruction filter or display spot spectrum.)
Suppose an image f(x, y) = 2 cos 2π(x/4a + y/8a), with a = Δ, is scanned. Using (4.25) and taking the Fourier transform, we obtain

    G(ξ₁, ξ₂) = sinc²(aξ₁/2) sinc²(aξ₂/2) F(ξ₁, ξ₂) = sinc²(1/8) sinc²(1/16) F(ξ₁, ξ₂) ≈ 0.94 F(ξ₁, ξ₂)

where we have used F(ξ₁, ξ₂) = δ(ξ₁ − 1/4a, ξ₂ − 1/8a) + δ(ξ₁ + 1/4a, ξ₂ + 1/8a). The scanner output signal can be written as

    g_s(x, y) = g(x, y) w(x, y) Σ Σ_{m,n=−∞}^{∞} δ(x − mΔ, y − nΔ)
where Ĝ(ξ₁, ξ₂) ≜ G(ξ₁, ξ₂) ⊛ W(ξ₁, ξ₂) and W(ξ₁, ξ₂) ≜ L² sinc(ξ₁L) sinc(ξ₂L) is the Fourier transform of the finite scanning window of size L × L, with L = 256a. This gives

    Ĝ(ξ₁, ξ₂) = 61,440a² [sinc(256aξ₁ − 64) sinc(256aξ₂ − 32) + sinc(256aξ₁ + 64) sinc(256aξ₂ + 32)]

Figure 4.15 shows Ĝ(ξ₁, ξ₂) at ξ₂ = 1/8a, for ξ₁ > 0. Thus, instead of obtaining a delta function at ξ₁ = 1/4a, a sinc function with a main-lobe width of 1/128a is obtained. This degradation of Ĝ(ξ₁, ξ₂) due to convolution with W(ξ₁, ξ₂) is called ripple. The associated energy (in the frequency domain) leaked into the side lobes of the sinc functions due to this convolution is known as leakage.
Lagrange Interpolation

The zero- and first-order holds also belong to a class of polynomial interpolation functions called Lagrange polynomials. The Lagrange polynomial of order (q − 1) is defined as

    L_k^q(x) = Π_{m=k₀, m≠k}^{k₁} (x/Δ − m)/(k − m),    with L_k^1(x) ≜ 1 for all k     (4.28)

where k₀ ≜ −(q − 1)/2, k₁ = (q − 1)/2 for q odd and k₀ ≜ −(q − 2)/2, k₁ ≜ q/2 for q even. For a one-dimensional sampled sequence f(mΔ), with sampling interval Δ, the interpolated function between given samples is defined as

    f̂(x) = f̂(mΔ + αΔ) ≜ Σ_{k=k₀}^{k₁} L_k^q(α) f(mΔ + kΔ)     (4.29)

In two dimensions the interpolation is applied separably, where q₁ and q₂ refer to the Lagrange polynomial orders in the x and y directions, respectively.
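A minimal one-dimensional sketch of (4.28)-(4.29) (Python with NumPy assumed; names illustrative), evaluating the order-(q − 1) Lagrange interpolator at a fractional offset α between samples:

    import numpy as np

    def lagrange_weights(alpha, q):
        # Neighbor offsets k0..k1 as defined for odd and even q
        if q % 2:
            k0, k1 = -(q - 1) // 2, (q - 1) // 2
        else:
            k0, k1 = -(q - 2) // 2, q // 2
        ks = np.arange(k0, k1 + 1)
        w = np.ones(len(ks))
        for i, k in enumerate(ks):
            for m in ks:
                if m != k:
                    w[i] *= (alpha - m) / (k - m)   # L_k^q(alpha), eq. (4.28)
        return ks, w

    def lagrange_interpolate(samples, m, alpha, q=4):
        # f_hat(m*Delta + alpha*Delta) by eq. (4.29); samples indexed by integer m
        ks, w = lagrange_weights(alpha, q)
        return sum(wk * samples[m + k] for k, wk in zip(ks, w))

For q = 1 the single weight is 1 (zero-order hold), and for q = 2 the weights reduce to (1 − α, α), the first-order hold.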
(Quantizer input-output characteristic: an input u lying between the transition levels t_k and t_{k+1} is mapped to the reconstruction level r_k; the lower curve shows the resulting quantizer error u − u*.)
Example 4.3

The simplest quantizer is the uniform quantizer. Let the output of an image sensor take values between 0.0 and 10.0. If the samples are quantized uniformly to 256 levels, then the transition and reconstruction levels are

    t_k = 10(k − 1)/256,    k = 1, ..., 257
    r_k = t_k + 5/256,    k = 1, ..., 256

The interval q ≜ t_{k+1} − t_k = r_k − r_{k−1} is constant for different values of k and is called the quantization interval.
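A minimal sketch of this uniform quantizer (Python with NumPy assumed; parameter names are illustrative):

    import numpy as np

    def uniform_quantize(u, lo=0.0, hi=10.0, levels=256):
        # Map each sample in [lo, hi] to the reconstruction level of its interval
        q = (hi - lo) / levels                     # quantization interval
        k = np.clip(np.floor((u - lo) / q), 0, levels - 1)
        return lo + k * q + q / 2                  # r_k = t_k + q/2

    x = np.array([0.0, 3.14, 9.99])
    print(uniform_quantize(x))                     # approx. 0.0195, 3.1445, 9.9805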
In this chapter we will consider only zero-memory quantizers, which operate on one input sample at a time, so that the output value depends only on that input. Such quantizers are useful in image coding techniques such as pulse code modulation (PCM), differential PCM, transform coding, and so on. Note that quantization is irreversible; that is, for a given quantizer output, the input value cannot be determined uniquely. Hence, a quantizer introduces distortion, which any reasonable design method must attempt to minimize. There are several quantizer designs available that offer various trade-offs between simplicity and performance. These are discussed next.
The Lloyd-Max Quantizer

This quantizer minimizes the mean square error for a given number of quantization levels. Let u be a real scalar random variable with a continuous probability density function p_u(u). It is desired to find the transition levels t_k and the reconstruction levels r_k for an L-level quantizer such that the mean square error

    𝓔 = E[(u − u*)²] = ∫_{t₁}^{t_{L+1}} (u − u*)² p_u(u) du     (4.32)

      = Σ_{i=1}^{L} ∫_{t_i}^{t_{i+1}} (u − r_i)² p_u(u) du     (4.33)

is minimized. Setting the partial derivatives of 𝓔 with respect to t_k and r_k to zero gives the necessary conditions

    t_k = (r_k + r_{k−1})/2     (4.34)

    r_k = ∫_{t_k}^{t_{k+1}} u p_u(u) du / ∫_{t_k}^{t_{k+1}} p_u(u) du = E[u | u ∈ 𝒯_k]     (4.35)

where 𝒯_k is the kth interval [t_k, t_{k+1}). These results state that the optimum transition levels lie halfway between the optimum reconstruction levels, which, in turn, lie at the center of mass of the probability density in between the transition levels. Together, (4.34) and (4.35) are nonlinear equations that have to be solved simultaneously given the boundary values t₁ and t_{L+1}. In practice, these equations can be solved by an iterative scheme such as the Newton method.
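Another common way to solve (4.34)-(4.35) numerically is the alternating fixed-point (Lloyd) iteration sketched below (Python with NumPy assumed; this is an illustrative scheme, not the Newton procedure mentioned above), shown for a unit-variance Gaussian density truncated to a finite range:

    import numpy as np

    def lloyd_max(pdf, t1, tL1, L, iters=200, grid=20001):
        # Alternate eqs. (4.34) and (4.35) to find transition levels t and levels r
        u = np.linspace(t1, tL1, grid)
        p = pdf(u)
        t = np.linspace(t1, tL1, L + 1)            # initial uniform transition levels
        r = np.zeros(L)
        for _ in range(iters):
            for k in range(L):                     # eq. (4.35): centroid of each interval
                mask = (u >= t[k]) & (u < t[k + 1])
                r[k] = np.trapz(u[mask] * p[mask], u[mask]) / np.trapz(p[mask], u[mask])
            t[1:L] = 0.5 * (r[1:] + r[:-1])        # eq. (4.34): midpoints of the levels
        return t, r

    gauss = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
    t, r = lloyd_max(gauss, -4.0, 4.0, L=4)
    print(r)    # close to [-1.51, -0.45, 0.45, 1.51], cf. Table 4.1 for L = 4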
When the number of quantization levels is large, an approximate solution can be obtained in closed form. Let A ≜ t_{L+1} − t₁ and z_k = (k/L)A, k = 1, ..., L. This method requires that the quantities t₁ and t_{L+1}, also called the overload points, be finite. These values, which determine the dynamic range A of the quantizer, have to be assumed prior to the placement of the decision and reconstruction levels. Once the transition levels {t_k} have been determined, the reconstruction levels {r_k} can be determined easily by averaging t_k and t_{k+1}. The quantizer mean square distortion is obtained as

    𝓔 ≈ (1/12L²) [ ∫_{t₁}^{t_{L+1}} [p_u(u)]^{1/3} du ]³     (4.38)

This is a useful formula because it gives an estimate of the quantizer error directly in terms of the probability density and the number of quantization levels. This result is exact for piecewise constant probability densities.
Two commonly used densities for quantization of image-related data are the Gaussian and the Laplacian densities, which are defined as follows.

Gaussian:

    p_u(u) = (1/√(2πσ²)) exp[−(u − μ)²/(2σ²)]     (4.39)

Laplacian:

    p_u(u) = (α/2) exp(−α|u − μ|)     (4.40a)

where μ and σ² denote the mean and variance, respectively, of u. The variance of the Laplacian density is given by

    σ² = 2/α²     (4.40b)

Tables 4.1 and 4.2 (on pp. 104-111) list the design values for several Lloyd-Max quantizers for the preceding densities. For more extensive tables see [30].
For uniform distributions, the Lloyd-Max quantizer equations become linear, giving equal intervals between the transition levels and the reconstruction levels. This is also called the linear quantizer. Let

    p_u(u) = 1/(t_{L+1} − t₁)  for t₁ ≤ u ≤ t_{L+1},  and  0 otherwise     (4.41)

Substituting this density into (4.34) and (4.35) gives q ≜ t_{k+1} − t_k = (t_{L+1} − t₁)/L, and finally we obtain

    t_k = t_{k−1} + q,    r_k = t_k + q/2     (4.42)

Thus all transition as well as reconstruction levels are equally spaced. The quantization error e ≜ u − u* is uniformly distributed over the interval (−q/2, q/2). Hence, the mean square error is given by

    𝓔 = (1/q) ∫_{−q/2}^{q/2} e² de = q²/12     (4.43)

The variance σ_u² of a uniform random variable whose range is A is A²/12. For a uniform quantizer having B bits, we have q = A/2^B. This gives

    SNR = 10 log₁₀ (σ_u²/𝓔) = 10 log₁₀ 2^{2B} ≈ 6B dB     (4.44)

Thus the signal-to-noise ratio achieved by the optimum mean square quantizer for uniform distributions is 6 dB per bit.
Properties of the Optimum Mean Square Quantizer

1. The quantizer output is an unbiased estimate of the input, that is,

    E[u*] = E[u]     (4.45)

2. The quantization error is orthogonal to the quantizer output, that is,

    E[(u − u*)u*] = 0     (4.46)
TABLE 4.1 Optimum mean square quantizers for the Gaussian density with zero mean and unity standard deviation; t_{−k} = −t_k, r_{−k} = −r_k, t_{L/2+1} = ∞. For each number of levels L the table lists the mean square error (MSE), SNR (dB), output entropy, and the transition and reconstruction levels t_k, r_k. For example, for L = 2 through 8:

    Levels     2       3       4       5       6       7       8
    MSE        .3634   .1902   .1175   .0799   .0580   .0440   .0345
    SNR (dB)   4.3964  7.2085  9.3003  10.972  12.367  13.565  14.616
    Entropy    1.0000  1.5358  1.9111  2.2029  2.4428  2.6469  2.8248
TABLE 4.2 Optimum mean square quantizers for the Laplacian density with zero mean and unity variance; t_{−k} = −t_k, r_{−k} = −r_k, t_{L/2+1} = ∞. The layout parallels Table 4.1. For example, for L = 2 through 8:

    Levels     2       3       4       5       6       7       8
    MSE        .5000   .2642   .1762   .1198   .0899   .0681   .0545
    SNR (dB)   3.0103  5.7800  7.5401  9.2152  10.464  11.669  12.638
    Entropy    1.0000  1.3169  1.7282  1.9466  2.2071  2.3745  2.5654
Proofs
which proves (4.46). This gives an interesting model for the quantizer, as
shown in Fig: 4.18. The quantizer noise" is uncorrelated with the quantizer
output, and we can write
u = u: +" • (4.49)
(J~ = E[(u - u: )2] = E[u 2] - E[(U·)2J ' (4.50)
Since (J~ 2: 0, (4.50) implies the average power of the quantizer output is
reduced by the average power of quantizer noise. Also, the quantizer noise"
is dependent on the quantizer input u, since ' '
E[u,,] = E[u(u- u')J = E[,,2J
3. Since for any mean square quantizer (J~ = CT~f(B), (4.50) immediately yields
(4.47). •
•
•
,,
u Optimum u: + u•
mean square
quantizer > • u +
•
.., -
'I
u=u' +'1
E[u''1]=Q •
E[u'll = E['1 2 j ,
Figure 4.18 Optimum mean square quantizer and its signal dependent noise
model.
r [Pu(1I )P'3 dn
t1 ' , .
[(x) = 2a -a (4.52)
, ,
where [ -a, a] is the range of IV over which the uniform quantizer operates. If pu(tt) .
is al.1 even function, that is, Pu(1I )= PuC -tt ), we get
, ro [puett. d» )]113 •
W = flu) y u'=g(YI
--- I I
-. I
I
I
I
I I •
u.. •
W Y
,
,
• ,
u I( • ) W Uniform Y g( • ) u
compressor. , quantizer expander
• , •
•
a (I - exp( -ax/3)]
.,.---,....:-----c'c:::-:- 0S X s A
f(x) = [1 - expf -aA 13)}'
-!(-x),-A sx <0.
g(x) '"
-~ln{l-; [l-exp(-~)]}, Q:sx:sa
(4.56)'
-g(-x), -a:sx<O. I
Transformations f(x) and g (x) for other probability densities are given in Problem
4.15.
Remarks •
1. The compander design does not cause the transformed random variable w to
be uniformly distributed:
2. For large L, the mean square error estimate of the compander can be approxi-
mated as
l' __
(. =
1 'J't.'
lZU ,[pifl)J
j/3
dll
3
(4.57)
t! .
. A --~
.~~
.-
'2a=Lq
"
where q has to be determined so that the mean square error is minimized. In terms
of these parameters,
_ ", r "
s = ~1 r+ 1
(tt - ri PU(II) d« +:2 roo (ll - rd pill) d»:
1""2 fj "Q-q
d "=
---,-0 a
dq
For L = 2, the optimum uniform and the Lloyd-Max quantizers are identical, giving
a= 2 r IIpu(II) dr/.
which equals the mean value of the random variable lui. Table 4.3 gives the opti-
mum uniform quantizer design parameters for Gaussian and Laplacian probability
densities.
•
4;9 EXAMPLES, COMPARISONS, AND PRACTICAL LIMITATIONS
•
Shannon quantizer •
40
. "'" '-...
Opt. uniform with
entropy coding
t 30
'"•
-0
0::
z·
'" 20
10
where D < 0'2, is the average mean square distortion per sample. This can also be
written as /}
. ..... '-~1, ,
•
,
• receivers to maintain a constarit bit rate over communication channels. From Fig,
4.20 it is seen that the uniform quanrizer with entropy coding gives a better per-
formance than the Lloyd-Max quantizer (without entropy coding). It has been
found that the uniform quantizer is quite a good approximation of the "optimum
quantizer" based on entropy versus mean square distortion criterion, if the quanti-
zation step size is optimized with respect-to this criterion,
In practice the design of a quantizer boils down to the selection of number of
quantization levels (L) and the dynamic range (A). For a given.number of levels, a
compromise has to be struck between the quantizer resolution (ti - t;-I) and the
attainable dynamic range. These factors become particularly important when the
input signal is nonstationary or has an unknown probability' density.
•
'4.10 ANALYTIC MODELS FOR PRACTICAL QUANTIZERS [3D}
!
In image coding problems we will find it useful to have analytic expressions for the
quantizer mean square error as a function of the number of bits. Table 4.4 lists the
•distortion function models for the Lloyd-Max and optimum uniform quantizers for
Gaussian and Laplacian probability densities of unity variance. Note that the mean
of the density functions can be arbitrary, These models have the general form
feB) = a2- bB • If the input to the quantizer has a variance (J"2, then the output mean·
square error will be simply (J"2/(B). It is easy to check that the f(B) models are
monotonically decreasing, convexfunctions of B, which are properties required of
distortion versus rate functions.
From Table 4.4 we see that for equal distortion, the number of bits, x, needed
for the optimum mean square quantizer to match the performance of a B-bit
Shannon quantizer is given by
r 2B "" 2.26(rl.%3x)
tizer performs within about Y2 bit of its lower bound achieved by an infinite-
dimensional block encoder (for Gaussian distributions).
•
•
4.11 QUANTIZATION OF COMPLEX
GAUSSIAN RANDOM VARIABLES
•
•
then A and 9 are independent, where A has the Rayleigh density (see Problem
4.15b\.~m! 9 is uniformly distributed. It can be shown the minimum mean square
quantizer for z requires that (\ be uniformly quantized. Let L 1 and L z be the number .
of quantization levels for A and 6, respectively, such that L 1 L z = L (given). Let {Vk}
and {Wk} be the decision and reconstruction levels of A, respectively, if it we~e
quantized independently by its own mean square quantizer, Then the decision levels
{tk} and reconstruction levels irk} of A for the optimum mean square reconstruction
of z are given by [31]
,,
(4.63)
• f 1)
rk"" Wk smc~L~
If L 2 is large, then sinc(1IL+-o> 1, which means the amplitude and phase variables
can be quantized independently. For a given L, the optimum allocation of L 1 and L~
requires that for rates log, L ~ 4.6 bits, the phase should be allocated approximately
1.37 bits more than the amplitude [32].
The performance of the joint amplitude-phase quantizer is found to be Quly
marginally better than that of the independent mean square quantizers for x and y,
However, the preceding results are useful when one is required to digitize the .
amplitude and phase variables, as in certain coherent imaging applications where
amplitude and phase measurements are made directly.
Lum mance
u •
f( " )
luminance to
c MMSE c•
,
.
I r: , . ,\
contrast to
I ,
u·
,quantizer
contrast -Iuminanee
,
•
Figure 4.21 Contrast quantization.
same value, re i9ns of constant gray levels are formed, whose boundaries .i!L~ called
contours (see Fig. 4.23a). rutorrn quantization of common images, where the
pixels represent the luminance function, requires about 256 gray levels, or 8 bits.
Contouring effects start becoming visible at or below 6 bits/pixel. A mean square
quantizer matched to the histogram of a given image may need only 5 to 6 bits/pixel
without any visible .contours. Since histograms .pf imagl:ls vary Quite dr?sticalIy,
o '
-' ---
an s uare ' uantizers for raw image data are rarely u.~e~L A ,uniform
guantizer with 8 bits/pixe IS uS,!lally used. ~, ' .
-,--~..
.. -
,
----
In evaluating quantized images, the eye seems to be quite sensitive to contours
and errors that affect local structure. However, the contours do not contribute very
much to the mean square error. Thus a visual quantization scheme should attempt
to hold the quantization contours below the level of visibility over the range of
luminances to be displayed. We consider two methods of achieving this (other than
allocating the fullS bits/pixel).
Contrast Quantization
lI(m, 0)
Pseudorandom -
noise unitorrrr.
[-A,A! Figure 4.22 Pseudorandom noise quan-
• •
nzanon,
image, the same (or another) pseudorandom sequence is subtracted from the quan-
tizer output, J'hi< effect i~ that}o the regio~~I~:.::lumir:ancegr1l.die.Qts ,{which are
the regions of contours), the input noise causes pixels to go above or below the
original decision level, th'ereby6reaking tn~cOnfQllrs. However, the average value
of1Iie'quantized-' . ute same with and without the additive noise .
. During display, the noise tends to fill in the regions of contours in such a way that
the spatial average is unchanged (Fig. 4.23). The amount of dither added should be
kept small enough to maintain the spatial resolution but large enough to allow the
luminance values to vary randomly about the quantizer decision levels. The noise
should usually affect the least significant bit of the quantizer. Reasonable image
quality is achievable by a 3-bit quantizer.
Halftone Image Generation
•
<.
a b
Figure 4.23 c d (a) 3-bit image, con-
tours are visible; (b) g·bit image with
pseudorandom noise uniform over
[-16,16); (c) v'(rn,n), 3-bit quantized
v(rn,1I ); (d) image after subtracting pseu-
dorandom noise.
Threshold
v'
•
+ ,
+
0 A
• 0< III m , nl <A
Pseudorandom array
40 60 150 90 10
80 170 240 200 lID
HI -- 140
120
210
190
250
230
220
180
130
70
•
,•
20 100 160 50 30
in this text, are halftones. Figure 4.24 shows the basic concept of generating
halftone images. The given image is oversampled (for instance, a, 256 x 256 image
may be printed on a 1024 x 1024 grid of black and white dots) to coincide with
the number of dots available for the halftone image. To each image sample (repre-
senting a luminance value) a random number (halftone screen) is added, and the
resulting signal is quantized by a l-bit quantizer. The output (0 or 1) then represents
a black or white dot. In practice the dither signal is a finite two-dimensional pseudo-
random pattern that is repeated periodically to generate a halftone matrix of the
. same size as the image. Figure 4.25 shows two halftone patterns. The halftone
image may exhibit' if the image pattern and the dither matrix have
common or nearly common periodicities. Good halftoning algorithms are designed
to minimize the Moire, ~f.fect. Figure 4.26 shows a 512 X 512 halftone image gener-
ated digitally from the original 512 x 512 x 8-bit image. The gray level rendition in
halftones is due to local spatial averaging performed by the eye. In general, the
perceived gray level is equal to the number of black dots perceived in one resolution
cell. One resolution cell corresponds to ..
the area occupied by one pixel in the
original image.
Color Quantization
• •
a b
Figure 4.26 c d (a) Original 8-bitl
pixel image; (b) halftone screen H,; (c)
halftone image; (d) most significant I-bit!
pixel image.
its elements Cj, C2 , C3 representing the three color primaries. From Chapter 3 we
know the color gamut is a highly irregular solid in the. three-dimensional space.
Quantization of a color image requires allocating quantization cells to colors in the
color solid in the chosen coordinate system. Even if all the colors were equally likely
(uniform probability density), the quantization cells will be unequal in size because
equal changes in color coordinates do not, in general, result in equal changes in
perceived colors.
Figure 4.27 shows a desirable color quantization procedure. First a coordinate
transformation is performed and the new coordinate variables Tk are independently
quantized. The choice of transformation and the quantizer should be such that
the perceptual color difference due to quantization is minimized. In the NTSC
color coordinate Rj.b G N , BN system, the reproducible color gamut is the cube
[0, IJ x [0, IJ x [0, IJ. It has been shown that uniform quantization of each color
coordinate in this system provides the best results as compared to uniform quantiza- •
tion in several other coordinate systems. Four bits per color have been found to be
just adequate in this coordinate system.' ,
•
,
, T, T,', , WN
Ouannzer
4.1 In the RETMA scanning convention, 262.5 lines of each field are scanned in ;;\; s.
Show that the beam has a horizontal scan rate of 15.75 KHz and a slow downward
motion at vertical scan rate of 60 Hz.
4.2 Show that a bandlimited image cannot be space-limited, and vice-versa.
4.3 The image [(x, y) = 4 cos 41rx cos 6rry is sampled with oilx = D.y "" 0.5 and Ax = Ay =
0.2. The reconstruction filter is an ideal low-pass filter with bandwidths 6.:lx, lAy).
What is the reconstructed image in each case?
4.4 The NTSC composite video signal of (4.2) is sampled such that Ax = l~xO"Ay '" Ilf/.
At any given frame, say at t = 0, what would be the spectrum of the sampled frame for
the composite color signal spectrum shown in Fig, P4.4?
Chrorninance ,
spectra
I.
Luminance
/speetrum
~
Sen$(lr nolse
nix, y)
Ii. ·2;,
•
Figure P4.6 Sampling of noisy images.
•
124 . Image Sampling and Quantization Chap. 4
• •
,
SNRs of the reconstructed image with and without prefiltering are a}/( '1~}) and
<r}/(4'1tJ), respectively. What would be the SNR of the reconstructed image if the
sensor output were sampled at the Nyquist rate of the noise without any prefiltering?
Compare the preceding sampling schemes and recommend the best way for sampling
• •
nOIsy Images. .
4.7 Show that (4.15) is an orthogonal expansion for a bandlimited function such that the
least squares error
samples am." such that the reconstructed function IM,N (x ,y) ~ i: i: am." 4>m,n(x ,y)
m"'O n=O
minimizes the mean square error O"~f.N ~ ffL L E(/ f(x ,y) - !M,N(X ,Y)1 ] dx dy, Let 2
<Pm.n(X ,y) be a set of complete orthonormal functions obtained by solving the eigen-
value integral equation
JfL R(x, Yi X ', Y ')<Ilm.n(X', y') dx ' dy ' = Am.n <Pm.,,(x, y), -L sx,y -s L
•
a. Show that {am.,,} are orthogonal random variables, that is,
E[a m,,, am',n'] = Am,,, SCm - m", n - n ').
•
b. Show that <r1,N is minimized when {<jlmA are chosen to correspond to the largest
~ ~
g(x,Y)=i:i:S(x-2m,y-2n)+i:i:o(x-2m - 1,y-2n-l)
m,"
m"O
which is the. well-known product expansion of sinc(x - k).
b. Show that the Lagrange interpolation formula of (4,29) satisfies the properties'
•
f(mA) = f(mA)-that is, no interpolation error at known samples-and
•
flex) dx =.:ii: f(m.:i)-that is, the area is.preserved according to the trapezoidal
• •
nl
rule of integration; .
c. Write the two-dimensional interpolation formula for q I = q, = 1, 2, 3.
Chap. 4 Problems - ,- .
125
(Moire effect-sone-dimensional} A one-dimensional function f(x) = 2 cos 1'rsoX is
• sampled at a rate ~~, which is just above ~o. Common reconstruction filters such as the
zero- or first-order-hold circuits have a passband greater than ±~J2 with the first zero
crossings at :l:~s. as shown in Fig. P4.lla. .
a. Show that the reconstructed function is of the form _..- >
~- . -
---~-.
,,
j(x) = 2(a + b cos 2'lT~sx) cos 1'rSoX + 2b sin 2'l1"~sx sin 'If Sox
afiiH ("2
~o)
~,
.
~,H(~l
bAI/(" -!?),.
Ejjj:,
. "J~S
'.
I
I •
\ I
•
I •
t· }r"\
. ! •
~0
,n
i / . ~'" !
! • \ .
o
~o ~
--:I -:I
ta}
I
o
I
t
-1
-2 I ·
•
•
whicn is a beat pattern between the signal frequency ~"i2 and one of its companion
frequencies ~, - (~j2) present in the sampled spectrum.
b. Show that if the sampling frequency is above twice the highest signal frequency,
then the Moire effect will be eliminated. Note that if F;o = O-that is, f(x}=
constant-then the reconstructed signal is also constant-that is, the sampling
system has a flat field response.
. c.' As all example, generate a sampled signa! j;(k) '" 2 cos(k1T/L05), which corre-
sponds to ~o = 1, ~, = L05. Now plot this sequence as a continuous signal on a
line plotter (which generally performs a first-order hold). Fig. P4.11b shows the
nature of the result, which looks like an amplitude-modulated sine wave. This is a
Moire pattern in one dimension ..
4.12 (Moire effect-two dimensions) An irnage.j'(x , y) = 4 cos 4'l1'x cos 4'l1'y, is sampled at
a rate ~,,= ~Y' = 5. The reconstruction filter has the frequency response of a square
display spot of size 0.2 x 0.2 but is bandlimited to the region [-5, 5] x [-5, 5].
Calculate the reconstructed image. If the input image is a constant gray instead, what
would the displayed image look like?Would this display have a flat field response?
4,13 If t., r. are the decision and reconstruction levels for a zero mean,: unity variance
random variable u, show that i, = '" + ati, h = fl. + crs, ·are· the 'corresponding
quantities for a random variable v having the same distribution but with mean fA. and
variance r;2. Thus v may be quantized by first finding U = (v - fJ.}/r;, then quantizing u
by a zero mean, unity variance quantizer to obtain u', and finally obtaining the
quantized value of v as v' = [L + au' .
•
4.14 Suppose the compandor transformations in Fig. 4.19 are g(x) = r'(x) and
A. ( } - ["p,,(II)dll, u>O
w= f u - o
-f(-u}, u <0
where p,,(1l ) = p,,(-,,). This transformation (also called histogram equalization) causes
w to be uniformly distributed over the interval [-i, il. The uniform quantizer is now •
optimum for w. However, the overall quantizer need not be optimum for u.
. . a. Lt
e p" ()
II
_ {1-
O.
-
1
1I1
, -1<,,"';1
otherwise
and let the number of quantizer levels be 4. What are the decision and
reconstruction levels for the input u't Calculate the mean square error.
b. Show that this cornpandor is suboptimal compared to the one discussed in the text.
4.15 (Compandor transformations)
a. For zero mean Gaussian random variables, show that the compander transforma-
. tions are given by f(x) = 2 erf(xIV&r), x 02: O,and g(y) = V&r cn'(yI2), y 02:0,
where erf(x) ~ (1/y';) I~ exp( _yZ) dy. .
•
b. For the Rayleigh density , .
,
pu(~)= ;zcxP(-;:zz), ,,>0
•
0, a <0
•
show that the transformation is
..... - '
[(x) = c J: a ' 13
exp ~1
Z: da
•
,, nf = rl m ;. ( /) + *
that is, for uniform distributions, the one-dimensional optimum mean square
*
, quantizer is within bit/sample of its minimum achievable rate of its Shannon
,
. quannzer.
4.19* Take a 512 x 512 g·bit/pixel image and quantize it to 3 bits using a (a) uniform
quantizer, (b) contrast quantizer via (4.65) with 0: = 1, 13 = 11l, and (c) pseudorandom
noise quantizer of Fig. 4.22 with a suitable value of A (for instance, between 4 and 16).
Compare the mean square errors and their visual qualities.
. BIBLIOGRAPHY
Section 4.1
For scanning, display and other hardware engineering principles in image sampling
and acquisition:
Section 4.2
The two-dimensional sampling theory presented here is a direct extension of the
basic concepts in one dimension, which may be found in:
5. E. T. Whittaker. "On the Functions which are Represented by the Expansions of the
Interpolation Theory." Proc. Roy. Soc., Edinburgh, Section A 35 (1915): 181-194.
6. C. E. Shannon. "Communications in the Presence of Noise," Proc. IRE 37 (January
1949): 10-21.
•
For extensions to two and higher dimensions: .
Section 4.3 •
For extensions of sampling theory to random processes and random fields and for
orthogonal function expansions for optimal sampling: .
11. S. P. Lloyd. "A Sampling Theorem for Stationary (Wide Sense) Stochastic Processes,"
Trans. Am. Math. Soc. 92 (July 1959): 1-12.
12. J. L. Brown, Jr. "Bounds for Truncation Error in Sampling Expansions of Bandlimited
Signals," IEEE Trans. Inf. Theory IT-15 (July 1969): 440-444.
13. A. Rosenfeld and A. C. Kak. Digital Picture Processing. New York: Academic Press,
-. 1976, pp. 83-98.
Section 4.4
For aliasing and other practical problems associated with sampling:
15. R. Legault. "The Aliasing Problems in Two Dimensional Sampled Imagery." in
. Perception of Displayed Information, L. M. Biberman (ed.). New York: Plenum Press,
1973.
16. Special issue on Quantization. IEEE Trans. Inform. Theory. IT·28, no. 2 (March 1982).
17. A. K. Jain. "Image Data Compression: A Review." Proc. IEEE 69, no. 3 (March 1981):
349-389.
Section 4.7
20. P. F. Panter and W. Dite. "Quantizing Distortion in Pulse-Code Modulation with Non-
uniform Spacing Levels." Proc. IRE 39 (1951): 44-48.
21. B. Smith. "Instantaneous Companding of Quantizing Signals." Bell Syst. Tech. J. 27
(1948): 446-472. "
22. G. M. Roe. "Quantizing for Minimum Distortion." IEEE Trans. Inform. Theory IT-I0
(1964): 384-385. .
23. V. R. Algazi. "Useful Approximations to Optimum Quantization." IEEE Trans.
Commun. Tech. COM·14 (1966): 297-301.
•
Sections 4.8, 4.9
•
27. T. Berger. "Optimum Ouantizers and Permutation Codes." IEEE Trans. Inform.
Theory IT-16 (November 1972): 759-765.
•
28. A. N. Netravali and R. Saigal. "An Algorithm for the Design of Optimum Quantizers."
Bell Syst. Tech. J. 55 (November 1976): 1423-1435.
29. D. K. Sharma. "Design of Absolutely Optimal Ouantizers for a Wide Class of Distortion
Measures." IEEE Trans. Inform. Theory IT-24 (November 1978): 693-702.
Section 4.11
Here we follow:
31. N. C. Gallagher, Jr. "Quantizing Schemes for the Discrete Fourier Transform of a
Random Time-Series." IEEE Trans. inform. Theory IT-24 (March 1978): 156-163.
32. W. A. Pearlman. "Quantizing Error Bounds for Computer Generated Holograms,"
Tech. Rep. 6 503-1, Stanford University Information Systems Laboratory, Stanford,
Calif., August 1974. Also see Pearlman and Gray, IEEE Trans. Inform. Theory IT·24
(November 1978): 683-692. . . .
Section 4.12
33. F. W. Scoville and T. S. Huang. "The Subjective Effect of Spatial and Brightness
Quantization in PCM Picture Transmission." NEREM Record (1965): 234-235.
34. F. Kretz. "Subjectively Optimal Quantization of Pictures." iEEE Trans. Comm.
COM-23, (November 1975): 1288-1292. . .
35. L. G. Roberts. "Picture Coding Using Pseudo-Random Noise:" IRE Trans. Infor, ,
\
Theory IT-8, no. 2 (February 1962): 145-154.
36. J. E. Thompson and J. J. Sparkes. "A Pseudo-Random Quantizer for' Television
Signals." Proc. iEEE 55, no. 3 (March 1967): 353-355.
37. J. O. Limb. "Design of Dither Waveforms for Quantized Visual Signals," Bell Syst.
Tech. J., 48, no. 7 (September 1969): 2555-2583.
38. B. Lippel, M. Kurland, and A. H. March. "Ordered Dither Patterns for Coarse Quanti-
zation of Pictures." Fmc. IEEE 59, no. 3 (March 1971): 429-431. Also see IEEE Trans.
Commun. Tech. COM-13, no. 6 (December 1971): 879-889.
39. C. N. Judice. "Digital Video: A Buffer-Controlled Dither Processor tor Animated
Images." iEEE Trans. Comm. COM·25, (November 1977): 1433-1440.
40. P. G. Roetling. "Halftone Method with Edge Enhancement and Moire Suppression." •
J. Opt. Soc. Am. 66 (1976): 985-989.
•
41. A. K. Jain and W. K. Pratt.' "Color Image Quantization." National Telecomm.
Conference 1972 Record. IEEE Publication No. 72CH0601·SoNTC, Houston, Texas,
December 1972. .
•
. 42. J. O. Limb, C. B. Rubinstein, and J. E. Thompson. "Digital Coding of Color Video
Signals A Review." IEEE Trans. Commun. COM·25 (November 1977): 1349-1385.
,
~jJa\lti!il£iil~~'::IjAllai!ii$!ll!~.W!!ffli~)n.ULE_ll!l:ilIlI
1II_'Jli&D~*~~'\iIiiiilll~iwmllilllih_~_
5.1 INTRODUCTION
vector. An image transform provides a set of coordinates or basis vectors for the
vector space. "
For continuous functions, ..Qr!hogopi!Ls.eJ:· 'Q!1S provide ~ -
effi~it:njs which can be used for any further processing or analysis of the functions.
"For a one-dimensional sequence {u(n), O:s; 11. es N - I}, represented as a vector u of
size N, a unitary transformation is written as
N-l
N-l
" O:s;n<N-1
:::;>u(n) =2: v(k)a* (k, 11.1 (5.2)
"" .<=0
Equation (5.2) can be viewed as a series representation of the sequence u(n). The
columns of A *T, that is, the vectors at 4.{a' (k, s), 0 :s; 11. S N - 1V are called the"
basis vectors of A. Figure 5.1 shows examples of basis vectors of several orthogonal
-transforms-encountered in image processing. The series coefficients v(k) give "a
representation of the original sequence u(n) and are useful in filtering, data
compression ,feature extraction, and other analyses.
•
•
132 •
•
,
Cosine
,,
•
5ine ~Iadama'd Iia.r Slant l<LT
• •
,
k~' I I I L-Lr-r-rr
t;2
,
,,
, •
•
,
"
\
k';,3
t=4
.,
1.:=5
.1.:=6
'''IT
,
,~
k=1 ,
I
I
t
I r
~~
•
...
Ii.> •
01234561
,
01:234567 01234567 01234567
,
01234567 01234567
W •
Figure 5.1 .Basic vectors of the 8 X 8 transforms.
,
"
5.2 TWO-DIMENSIONAL ORTHOGONAL
AND UNITARY TRANSFORMS
•
wbere {ak. I (m, n)}, called an image transform, is a set of complete orthonormal
- . "
? Orthonormality: 2::L
m. n
"u(m, n)«:-,I' (m, n) = o(k'- k:', I-I')
= 0
(5.5)
N-I .
',i
....-l1'Completeness: LL
i,I-O
«k,l(m,n)ak,dm',n') =o(m -m',n -n')
.
(5.6) .
. The elements v (k, I) are called the t1]ln§fQrrncoe[ficients and V ~ {v (k, I)} is
called the lransforl'rled image. ,The .9rthQri<;!!:m~l!1x property assures that any trun~
cated series expansion of the form •
p. 1 Q - [ .
up,Q(m.n)~.L: .L: v{k,/)«t,dm,n), P::=;N, Q::=;N (5.7)
.-01-0
•
will minimize the sum of squares er~2:
1'/0 / "'} N-J
,I ,' "
- 'ir~ = .L:L [u(ni, n) - UP.Q (m, nW (5,8)
m,Il-'='O
when the coefficients v(k./) are given by (5.3). The cQmpletepe§.(l property assures
that this error will be zero for P = Q = N (Problem 5.1).
.
«k,l(m, n) = ak(m)bl(n) ~ a(k, m)b(/, n) (5.9)
"
where {adm), k = 0, , .. ,N - I}, {b i (n), I = 0, ... ,N -l} are one-dimensional
complete orthonormal sets of basis vectors. .lm . . . . '
A ~ {a(k, m)}
.--.- .....
---_~
and B ~ \l!QI-Ellihould bt'l unit~!1 m~i~~ thems~l'y~,ior example,
.
T (5.10) .
AA*T·=A A* =1
• . ,
Often one chooses B to be the same as A so that (5.3) and (5.4) reduce to
•
For an M x N •..
~ectang.ula.tjm.l!ge,
,",
the transform pair is
V=AMUAN (5.13)
U =' AAiTVA,V (5.14)
where AM and AN are M x M and N x N unitary matrices, respectively. These are
called two-dimensional separable transformations. Unless otherwise stated, we will
always imply the preceding separability when we mention two-dimensional unitary
transformations. Note that (5.11) can be written as
VT=A[AUF (5.15)
~~!insJ5.1l) can be performedlJyjir:st!!ll.~sforming~ach
c olum
then transforming each row of the result to obtain tnerows QfY.
'Z ' ~.~ _".
. '"~~- .. _.... -.~.-.~"' ... __.-.._- _ _ ~... •
Basis Images
Let at denote the kth :oll\!1ln ,Of A "T. Define the matrices
"
A k.I- 8k,,*T
81 (5.16)
•.
and the matrix inner product of two N x N matrices F and G as •
N-1N-l
(F, G) = 2: 2: f(m, n)g* (m, n) (5.17)
m""On=O
Then (5.4) and (5.3) give a series representation for the image as
J N- 1 f
IU = ~~ v(k, l)AZ,1 I . (5.18)
basis images for the same set of transforms in Fig. 5.1. The transform coefficient
v (k, I) is simply the inner product ofthe (k, /)th basis image with the given image. It
is also called the projection of the image on the (k, /)th basis image. Therefore, any
. ". N x N Image can be expanded in a series usingE .compl!:teset of N 2 basis images. If
U and V are mapped into vectors by row ordering, then (5.11), (5.12), and (5.16)
yield (see Section 2.8, on Kronecker products) . .-
\ lJ'=.(A@A)U'~vtU' (5.20)
•
)~:= (A@A)*To-=vt*To- : (5.21)
•
where •
(5.22)
" ,~
.--- .~"'" '. -~-~'
" , ,, I
, ,
" ~ .
• .",
"" '
~
~",
, 4 ",',J
"
,t •
''I' ,-,
~,- ,• "f f
~~
• T•
),.;
,,
...."'.'",
"
,.-
, • , , , ,
,•, , •
,0
•
~
--... .'
,
"....• ,'J" , ..
·, "-- ,--,
·. ..
'" ...
. """}"
~ ,
.' .~It"".
'
,
iI ~ .... ,
h--.
'" '
,~t:?:JfI'
r~
."!'
n, ,,;-"""""
, ,
rifT"
,'-'","
.. _.~
- - " ,
,,~ .'~
,
,
\
,
~:.
If~,J
, " '"
.':: ,-
, '.,
.':4
" " IF
i/O'
. ~.--
,
,
,
," ,"
•
. .' ,....,..--
, ,•
.... "
•
,,
-",--,"
oL__' __
._-}
. _••",. 0, , ,
" .. •,• , ••
",,
• .'
,'-'. .'
,
.~,
"
"...... is 1-' "
1.1 ..~ ."
, ,."
......
"-.'~"""
,
,--- "
, , , .
·, .,..
,
"'
"
A =~ (i -D,. U = (; ~)
the transformed image, obtained according to (5.11), is
,,
I
V
_ I
-2
(11 .~11)(13 2)4, (1\1 -11) ~ -24 :"'2
-2
1( ,6)(11 -1'1) -_(-25 -1)0
I . ' ',
To obtain the basis images, we find the outer product ofthe columns of A' r, which gives
A;.o=i(D(1, l)=~(i i)
, and similarly
•
,•
•,
, ,(1 -1)
(iO.l =" i: -1 =A,.o,
«r
,
•
Kronecker Products and Dimensionality
,,
,ff =./t.v (5.23)
is called separable if
(5.24)
" This is because (5.23) can be reduced to the separable two-dimensional trans-
formation .
. , (5.25)
•
.where X and Yare matrices that map into vectors ,» andJl', respectively, by row
ordering. If vt is N X N and A, , A2 are N x N, then the number of operations
2 2
required for implementing (5.25) reduces from N 4 to' about 2N 3 • The number of
operations can be reduced further if A j and A2 are also separable. Image transforms
such as discrete Fourier, sine, cosine, Hadamard, Haar, and Slant can be factored as
Kronecker products of several smaller-sized matrices, which leads to fast algorithms
for their implementation (see Problem 5.2); In the'context of image processing such
matrices are also calledJast ~'Eage transforms.
•Transform Frequency
For a one-dimensional signal I(x). frequency is defined by the Fourier domain
. variable ~. It is related to the number of zero crossings of the real or imaginary part
of the basis function exp{j21f~x}. This concept can be generalized to arbitrary
unitary transforms. Let the rows of a unitary matrix A be arranged so that the
number of zero crossings increases with the row number. Then in the trans-
formation
• y=Ax
the elements y(k). are, ordered ag;ording to inc~sing !!:'m'e numker o~ transtor~
fteg,u§1:l(_·Y.~ In the sequel any reference to freque!1cY" willimply the transform
frequency, that is, discrete Fourier frequency, cosine frequency, and so on. The
term spatial frequency generally refers to the continuous Fourier transform fre-
quencyand is not the same as the discrete Fourier frequency, In the case of
• Hadamard transform, a term called sequency is also used. It should be noted that
this concept of frequency is useful only on a relative basis for a particular transform.
A low-f ~, t:t:a.nstQrm cpuld GIl..ntainJ.he..high-fre';;h\Cnc.Lh(jrmol1!f§
of another transform. 7' , "
.~-------""'"" •
fifteril!lL~nd data comgre,ssi.on 9,f imqges ba~ed all the J!l.£'.l!P s,Q.llare..J:rite.cion. The
Karhunen-Loeve transform (KLT) is known to be optimum with respect to this
criterion and is discussed in Section 5.11.
,J:h 'nit ation preserves the SI nal energy or, equivalently, the_
JC1lg.th of the ve<:;!.Q; U in !~lt([me.nSiilliiiLYcctor SR~ce~ _.18- m~nseverY~\lnitilry_.
~nsfo!!,'.ationi§ sim '. . ·.-the.Yec1or u in the N -dim~ 'lector
space. Altern tiv ;furmatiof!)s a rotation of the b'.!§iu;..~
IDlQ.i.he comp9nents of v are J;.lJ.~.prQJ~tions~tu.D.a~basis (see Problem SA).
Similarly, for the two-dimensional unitary transformations such as (5.3), (SA), and
(5.11) to (5.14), it can be proven that ~.
•
)11-1 )11-1
L
k=O
E(lv(kW] "" L E[lu(n)1
n=O
2
] . (5.34)
The average energy E[!;'(k)j2J of the transform coefficients v(k) tends to be un-
evenly distributed, although it may be evenly distributed for. the input sequence
u(n). For a two-dimensional random field u(m, n) whose mean is IJ.u(m, n) and
covariance is rtm, n; m', n '), its transform coefficients v (k, l) satisfy the properties
~ [AR,A*1]k.k[AR2A*']I,1 (5,38)
•
where
•
Decorrelation
hen tlie input vector elementsar€;Jlighl:x.£on~l<1teJJ._the-1r.im~fo[11l~£Qeffi~.!!.ts
tend to~. e unco.rrelated. This means the off-diagonal terms of the COvl!ri<!lli:cmatrix
,!! tend to become §.!!lill . 0 the ~oiiaI elemenis:·~~- .. .~~
WitJi"respe<:t to the preceding two properties, the KL transform is optimum, .
that is, it packs tile maximum average energy in a given number of transform
coefficients while completely, decorrelating them. These properties are presented in
greater detail in Section 5.11.
Other Properties
Unitary transforms have other interesting properties. For example, the determinant
and the eigenvalues of a unitary matrix have unity magnitude. Also, the entropy
of a random vector is preserved under a unitary transformation. Since entropy is
a measure of average information, this means information is preserved under a
unitary transformation.
Example 5,2 (Energy compaction and deeorrelation)
A 2 x 1 zero mean vector u is unitarily transformed as
.: 1
v = '2
(\I3t' 1~
-1 v3} u,
v whereRu =
t> (1p Ip)' O'<p< 1
'.
The parameter p measures the correlation between u (0) and u (1).. The covariance of v
is obtained as
R = (1 + v3pl2 ·pl2. )
v \ p/2 1 - v3p/2
Fromthe expression for R. ,(1';'(0) = (1';(1) '" 1, that is, the total average energy of 2
is distributed equally between u(O) and u(l). However, (1~(0) = 1 + v3pl2 and
(J"~(1) = 1 - v3p/2. The total average energy is still 2, but the average energy in v(O) is .
greater than in v(l). If p = 0.95, then 91.1% of the total average energy has been
packed in the first sample. The correlation between v (0) and v (1) is given by •
which is less in absolute value than ipl for Ip! < 1. For p = 0.95, we find p,,(O, 1) = 0.83.
Hence the correlation between the transform coefficients has been reduced. If the
foregoing procedure is repeated for the 2 x 2 transform A of Example 5.1, then we find
(1":(0) = 1 + p, 0';(1) = 1- p, and p,(O, 1) = O. For p '" 0.95, now 97.5% of the energy is
packed in v(O). Moreover, v (0) and vCl) become uncorrelated. "
where
A
WN=exp {:-i N
2'1T}
(5.40)
The pair of equations (5.39) and (5.41) are not scaled properly to be unitary
transformations. In image processing it is more convenient to consider the unitary
DFT, which is defined as
N-l
•
v{k) = INN L u(n)W~,
n=O
k = 0, ... ;N-l (5.42)
, 1 N-l
u(n) = "\IN L v (k)W./ n
, n = 0, ... , N-l (5.43)
N k=O
The N x N unitary DFf matrix F is given by
Future references to DFf and unitary DFf will imply the definitions of (5.39) and
'(5.42), respectively. The DFf is one of the most important transforms in digital
, signal and image processing. It has several properties that make it attractive for .,
image processing applications.
o 1 234 o 1 2 3 4
n • n • Figure 5.3 Circular shift of u (n) by 2.
The OFT and Unitary OFT matrices are symmetric. By definition the
•• matrix F is symmetric. Therefore, .
(5.45)
The extensions are periodic. The extensions of the DFT and unitary DFT
of a sequence and their inverse transforms are periodic with period N. If for
example, in the definition of (5.42) we let k take alI integer values, then the
sequence v (k) turns out to be periodic, that is, v (k) == v (k + N) for every k.
The OFT is the sampled spectrum of the finite sequence u (n) extended
. by zeros outside the interval [0, N - 1]. If we define a zero-extended sequence
u(n)A u(n), O~n~N-l (5.46)
0, otherwise
•
then its Fourier transform is
<'Xl N-l
•
Comparing this with (5.39) we see that
The DFT or unitary OFT of a real sequence {x(n),n =O, ... ,N-l} is
conjugatesvmmetric about N12. From (5.42) we obtain
•
N-I ,v-1
2: u;(n)W,v(N-k)n= 2: u(n)Wt"=v(k)
v*(N-k}= •
n=O n=O
'-A--::-:"' ~.-.-r--~'
. , ,.
•
•
•
Figure 5.4 shows a 256-sample scan line of an image. The magnitude of its DFT is
shown in Fig. 5.5, which exhibits symmetry about the point 128. If we. consider the
.
200
•
150
100
,,(Il).
50 • •
o I..-_..J.._-..I"--_.J..._...J._ _ - -
o 50 100 150 zoo 250
FigIIref>.4 A 256-sampJe scan line of an
)it' n Image. •
•
89 •
70
.,
56
vCt)
42
•
28 •, ,
19
0.D1
o 100 150 200 250
• •
Figure 5.S Unitary disc. :te Fourier
k "...... transform of Fig. 5.4.
t N } { . N } (N\
v (0), { Re{v(k)},K=l""'Z-l, Im{v(k)},k=l""'Z-l ,V\Z) (5.51)
completely defines the DFf of the real sequence u(n). Therefore, it can be said that
the D.FT or unitary DFT of an N x 1 real sequence has N degrees of freedom and
requires the same storage capacity as the sequence itself.
The basis vectors of the unitary OFT are the orthonormal eigenvectors
of any circulant matrix. Moreover, th~ eigenvalues of a circulant matrix are
given by the OFT of its first column. Let H be an N x N circulant matrix.
Therefore, its elements satisfy , .
• [H}rn.• = hem - n) = h[(m - n) moduloN}, Osm, n <N -1 . (5,52)
The basis vectors of the unitary DFT are columns of F* T = F* ,that is,
. T
<l>k = .IN WiV lm
,0 s n S N - 1 , k = 0, ...• ,N -1 . (5.53)
(5.54)
Using (5.52) and the fact that W",' = W%-I (since W% = 1), the second and third
terms in the brackets cancel, giving the desired eigenvalue equation
[H<I>kJm = Ak<l>k(m)
or
(5.56)
where Ak , the eigenvalues of H, are defined as •
IV-I
Ak~
,
2:
[",,-0·
h(l)W~, Osk sN-1 (5.57) .
N-l
X2(n) = L h (n - k)cXl(k), O:Sn:sN-1 (5.58)
k=Q
,
then
•
DFT{X2(n)}N = DFT{h(n)}NDFT{Xl(n)}N (5.59)
where DFT{x(n)}N denotes the DFT of the -sequence x(n) of size N. This means we
can calculate the circular convolution by first calculating the DFT oh2(n) via (5.59)
and then taking its inverse DFT. Using the FFT this will take O(N log2N) oper-
ations, compared to N 2 operations required for direct evaluation of (5.58).
A linear convolution of two sequences can also be obtained via the FFT
by imbedding it into a circular convolution. In general, the linear convolution
of two sequences {h(n),n =0, .. :,N' -1} and {xl(n),n =0, ... ,N -1} is a se-
quence {xz(n), O:s n :s N' + N - 2} and can be obtained by the following algorithm:
Step 1: Let M> N' + N - 1 be an integer for which an FFT algorithm is available.
Step 2: Define Ii(n) and xl(n), O:s n :S M -1, as zero extended sequences corre-
o spending 10 hen) and x.(n); respectively.
Step 3: Let Y1(k) = DFT{xl(n)}M, Ak ':'" DFT{h(n)}M' Define y.(k) = AkYl(k), k =
. 0, ... ,M - 1 . . . ;
Step 4: Take the inverse DFT of y;(k) to obtain x2(n). Then x2(n) =x2(n) for
O<n:sN + N t - 2 . .
- . - - ' .....-.......,
utm, n) (5,64)
',' .. .
~,"r,
~,,~
,~
, ' ".
L
"
,
• ,
nd~ ... , •
la) Original image; (bj phase;
,
t:
, ,;. ., , "
. '
(
" < " ,
•"
, , '
..
, '
" ..,
-.<
,
, ,''':1
'j
•
'<i
«I ;.
>;,i
",
'i~,
,
,I a b!
, ; l' .~
-, ,.,,~ c dl
'.~,.4t Figure 5.7 Unitary DFT of images
'::'
,
(a) Resolution chart;
{l
(b) its DFT;
f' (c) binary image;
•
,f 1
./-.
(d) its DFT. The two parallel lines are due
to the 'l' sign in the binary image.
U=F*VF* (5.66)
If U and V are mapped into row-ordered vectors Ii' and 0', respectively, then
, ., (5,67)
g;=F(8)F (5,68)
The N 2 x N 2 matrix g; represents the N x N two-dimensional unitary DFf. Figure
5.6 shows an original image and the magnitude and phase components of its unitary
DFf. Figure 5.7 shows magnitudes of the unitary DFfs of two other images.
The properties of the two-dimensional $lt8r,y grT are quite similar to the one-
dimensional case and are summarized next.
Symmetric, unitary.
g;T=g;, g;-l = gr- = F* (8)1'* (5.69)
Periodic extensions.
+ N, 1 + N) = v{k, I),
v(k 'rIk, l
" (5.70)
,
utm + N, II + N) = utm, n), 'rim, n
Conjugate symmetry. The OFT and unitary OFT of !if!!{ i,!!,!!~s exhibit
conjugate symmetry, that is,
or
v(k,l)=v*(N-k,N-l), Osk,lsN - 1 (5.73)
From this, it can be shown that v(k, 1) has only N 2 independent real elements. For
example, the samples in the shaded region of Fig. 5.8 determine the complete OFT'
or unitary OFT (see problem 5.10).
Basis images. The basis images are given by definition [see (5.16) and •
(5.53)]: .
•
A k,* I -- A. "" -
'l'k'l'l -
1-
N {w-(km
N
+ In)
, Osm,n sN -I}, Osk,lsN-l . (5.74)
•
where I
•
INI2) - 1
N/2 •
n n'
•
.---,
N -1 ..; - - - - - - - , '" im, nl d--_~-_I :
..I
•
him, n) = 0
h(m - m', n - n'}c
Figure 5.9 shows the meaning of circular convolution. It is the same when a periodic
extension of hem, n) is convolved over an N x N region with ul(m, n). The two-
dimensional DFT of h (m - m', n - ,n ')c for fixed m t n' is given by
•
N-IN-l N-l-m'N-l-n'
:z:
m=O n=(}
Lh(m-m',n-n')cw>vmHnl)=W}j','Hn'l) L
i=-m'
2:
)=-n'
h(i,j)c Wj4H j i)
N-IN-t
= WW'k+n'l) L L hem; n)W~mk+nI) (5.77)
m "" 0 n "" 0
= w~m'k+n'I)DFt{h(m, n)}N
where we have used (5.76). Taking the DFT of both sides of (5.75) and using the
•
preceding result, we obtain'
DFT{u2(m, n)}N = DFT{h(m, n)}NDrl{ul(m. n)}N (5.78)
From this and the fast transform property (page 142), it follows that an N x N
circular convolution CjlfJ. be performed in O(N 2 10gzN) operations. This property is
also useful in calculafing two-dimensional convolutions such as
M-I M-I
where xI(m, n) and x2(m, n) are assumed to be zero for m, n rt:. [0, M -1]. The
region of support for the resultx3(m, n) is {O::5 m, n ::5 2M - 2}. LetN;:: 2M --I and
, .define N x N arrays
-( )' .:i rx2(m, n), O::5m,n::5M-l
h m.n
' =l 0, otherwise
(5.80)
•
•'(' ., ).:i xI(m, n), 0::5 m, n ::5 M - 1
UI m, n = (5.81)
0, otherwise
'We denote DFT{x(m. n)},y as the two-dimensional DFT of an N x N array x(m, n),Osm, n s N-1.
where
nCO) =
A fi
'.IN' .o.(k)~ :.J~ for 1 <k:::;;N-l , (5.88)
•
•
822
254
65
-503
The basis vectors of the 8 x 8 DeT are shown in Fig. 5.1. Figure 5.10 shows the
cosine transform of the image scan line shown in Fig. 5.4. Note that many transform
coefficients are small, that is, most of the energy of the data is packed in a few
transform coefficients.
The two-dimensional cosine transform pair is obtained by substituting
A = A* = C in (5.11) and (5.12). The basis images of the 8 x 8 two-dimensional
cosine transform are shown in Fig. 5.2. Figure 5.11 shows examples of the cosine
transform of different images.
•
Properties of the Cosine Transform •
!
,-
.r)
,
.
--
\Ii>
- "',
:"_,
•
• •
.. .' I
(a) Cosine transform examples of monochrome irn- (b) Cosine transform examples of binary images.
ages; ,
Figure 5.11
•
152 Image Transforms
•
•
which proves the previously stated result. For inverse cosine transform we.
write (5.89) for even data points as
N-l
u(2n) = u(2n) ~ ReL [a(k)v(k)e j" kI2A] e j2" nklN ,
. )c;= D
(5.93)
•
O~n ~(~)-1
The odd data points are obtained by noting that
(5.95)
Qc= -a
1~
1 -a.
o
-a 1- Ct
The proof is left as an exercise.
•
• 6. The N x N cosine transform is very close to the KL transform of a first-order
stationary Markov sequence of length N whose covariance matrix is given by
(2.68) when the correlation parameter p is close to 1. The reason is that R- 1 is
a f?mmetric tridiagonal matrix, which for a scalar ~2 ~ (1.-. p2)/(1 + p2), and
a = p/(1 + p'2) satisfies the relation . . .
•• •
1-pa -a
o
•
1",
. 1 -a
o -a 1-pa
Hence the eigenvectors of R and the eigenvectors of Q<, that is, the cosine
transform, will be quite close. These aspects are considered in greater depth in
Section 5.12 on sinusoidal transforms.
rThis property of the cosine transform together with the fact that it is a fast
\ transform has made it a useful substitute for the KL transform of highly
l£orrelated first-order Markov sequences. '
"
The N x N sine transform matrix \{1 = {Ij/(k, n)}, also called the discrete sine.tz:aIJ§-
.f!!rm (DST), is defined as
•
r
2 . 1'i(k + 1)(n + 1) ,
ljI(k, n) = N+ i sin- N +1 , O s k, n s N - 1 (5.98)
,
•
Unlike the previously discussed transforms, the elements of the basis vectors of the
Hadamard transform take only the binary values e I and are, therefore, well suited
for digital signal processing. The Hadamard transform matrices, Hn , are N x N
matrices, where N ~ 2 , n = 1,2,3. These can be easily generated by the core
n
• •
matrix
•
Hn=Hn-l®HI=Hl®H~_I=~ H
,
n- I
2 Hn -
(5.104)
1 •
,
Sec. 5.8 TheHadamard Transform 155
•
, which gives
,..
Sequency
r
1 1 1. 1
1 -1 1 -1
•
I 1
1 -
' 1
1 1 11
1 :'-11
0
7
I .s,
1 1 -1 -1 I 1 1 -1 -1/ 3
oc-<
•
1
,1 -1 ..,1' 1 I 1 -1 -1 1' 4
H3 =v'8 :::....,--_..:-_+------ (5.107)
1 1:' :~ 1 1-1 -1 -1 -1
1 - •
1
1 -1.1-1 1-1 1 -1 1 6
1 1 -1 -1 1-1 -1 1 1 2
1 -1 ~r 1 •1-1 1 1 -1 5
•
The basis vectors of the Hadamard transform can also be .zenera~ed by samf!ling ~
.slass of functions called the WalshJlIrl'lct,igns. These functions also take only the
binary values ±1 and form a complete orthonormal basis for square integrable
functions. For this reason the Hadamard transform just defined is also called the
Walsh-Hadamard transform. "
The number of zero crossings of a Walsh function or !he number of transitions
in a basis vector of the Hadamardtransform iscal1edJts s.eq.u.euCJi. Recall thatfor
sinusoidal signals, frequency can be defined in terms of the zero crossings. In the
Hadamard matrix generated via (5.104), the row vectors are not sequency ordered.
The existing sequency order of these vectors is called the Hadamard order. The
Hadamard transform of an N x 1 vector uis written as
v=Hu (5.108)
and the inverse transform is given by
u=Hv (5.109) ,
,where H ~ Hn .n = log, N. In series form the transform pair becomes
N-l
v(k)::= 1 £..,
'" u(m)(_l)b(k.ml, OsksN-l (5.110)
~ VNm~o ~
. N-l
u(m)= 1 2: v(k)(-l)b(k.ml, OSmsN-l (5.111)
VNk~O ,
where
n- I
and {k j } , {mi} are the binary representations of k and m, respectively, that is,
1k
k=ko+2k1 + ' " +2n - n_ 1
~
_
" -, ,~).\\" t
~,:;:-._' ! -
~-,. - .-., -. ,.
-- -,--
~>-
<: •
_i
~
- -, "
.. (
''-
'-
_., - .. ..' .
I
,
- - -- --
. -'-
1
•
, , .
q - -
jp~
~
.. , ~,.
)
"
f·
- ~",
,t I•
I·-
;-
*
~. 1O~_"',4_ ••, 0.~~··_'_·~ __ '_.' ,', -'.. ' _._.,0""",_~
(al Hadamard transforms of monochrome images, (b) Hadamard transforms of binary images.
images of the Hadamard transform are shown in Figs. 5.1 and 5.2. Examples of
two-dimensional Hadamard transforms of images are shown in Fig. 5.13.
H = II' = H T = Ir· .
(5.114)
2. The Hadamard transform is a fast transform. The one-dimensional trans-.
formation of (5.108) can be implemented in O(N logzN) additions and sub-
. tractions. ._' . .
•
Since the Hadamard transform contains only ±1 values, no multiplica-
tions are required in the transform calculations. Moreover, the number of
additions or subtractions required can be reduced from N Z to about N 10& N.
This is due to the fact that lin can be written as a product of n sparse matrices,
that is,
n =lo~N (5.115)
where. , \,-
.. ,.0'. ,
t'2 rows ' ,
•
, 1" 1.
1 0 •
,
0 0 1 .. 1 0 0
•
••
• •• ••
•• • •
0 0 1 1
•
------------- •
• • •
1 (5.116)
1 -1 0
0 0 1
0
-1 r-
N'
•
•• ••
•
••
•
••
• , '2 rows
0 0 • • • 1 -1 t
Sec. 5.8 The Hadamard Transform 157
•
- .
Since H contains only two nonzero terms per row, the transformation
_ _ 'W ".., •
V = H~ u = HH . , . Uu,
~-, ....
.n=logzN (5.117)
n terms
. - -
can be accomplished by operating H n times on u, Due to the structure of H
-
only N additions or subtractions are required each time H operates on a
vector, giving a total of Nn = N log:: N additions or subtractions.
(3. The natural order of the Hadamard transform coefficients turns out to be
equal to the bit reversed gray code representation of its sequency s. If the,
sequency s has the binary representation bnbn- 1 ••• b, and if the correspond-
ing gray code is gngn-l" .gJ, then the bit-reversed representationg-ga . . .g;
gives the natural order. Table 5.1 shows the conversion. of sequency s to
• natural order h, and vice versa, for N = 8. In general,
gk= b k (f1 bk H , ,,= 1, ... ,n-l ,
• gn=bn (5.118)
,
and
•
gk= h.- k + 1
bk ' gk(f1b k+ 1, k =n -1, .. ,,1 (5.119)
bn=g.
give the forward and reverse conversion formulas for the sequency and natural
ordering.
. 4. The Hadamard transform has good to very goo,Lener~
highly correlated images. let {u (n), 0 S n S N - I] be a stationary random
--=-.
....=--_--~---'=-.-
(l-~r(k)
1
1 +/I
k~l 2 reO)
(5.120)
2J
a
,R={r(m-n)} . (5.I21)
where Vi are the first NI]} sequency ordered elements Di . Note that the V k
are simply the meaJ;l~lus:s
.
.. . IIYh. The .'
significance of this result is that (; (NI2!) depends on the first 21 auto-
correlations only. For j = 1, the fractional energy packed in the first NIZ
scquency ordered coefficients win be (1 + r(I)/r(O)/2 and depends only upon
the one-step correlation p ~ r(I)/r(O). Thus for p = 0.95, 97.5% the total of
energy is concentrated in half of the transform coefficients.. The result of
(5.120) is useful in calculating the energy compaction efficiency of the
Hadamard transform. .
,
Example 5.3
Consider the covariance matrix R of (2.6&) for N '" 4. Using the definition of H 2 we
• • •
ootam . . ,
Sequeney
4 + 6p + 4p2 + 2p3 o ,
4 - 6p + 4p2 - 2p3 3
D= diagonal [H2 RHz] = ~
4 + 2p - 4p2 - 2p' 1
o 4 -2p - 4p2 + 2p3 2
This gives Do = Do , Dj = D, , D~ = D m = D
3, 1 and
as expected'according to (5.120).
I •
,.
where Osp Sn - 1; q = 0, 1 for p = 0 and 1 S q s2P for p "" O. For example, when
N=4,nh~ .
•
•
•
k 0 1 2 3
• , ,
p 0 0 1 1
•
•
•
q!0 1 1 2
•
A 1
ho(x) = ho.o(x) = VN ' x E [0, I}. (5.123a)
, •
q -1 q_1
~-<x< 2
..
' .. ",
. 1 2" 1 - 2"
h k (x) ~ hp• q (x) = VN q-2
S X
<!L (5.123b)
2P 2P •
0, otherwise for x E [0, 1]
The basis vectors and the basis images of the Haar transform are shown in Figs. 5.1
and 5.2. An example of the Haar transform of an image is shown in Fig. 5.14. From
thestructure of Hr (see (5. 124)J we see that the Haar transform takes differences of
the samples or differences of local averages of the samples of the input vector.
Hence the two-dimensional Haar transform coefficientsy(k,l), except for
k = l = 0, are the differences along rows and columns of the local averages of pixels
in the image. 'These are manifested as several "edge extractions" of the original
image, as is evident from Fig. 5.14.
Although SOn;1e work has. been done for using the Haar transform in image
data compression problems, its full potential in feature extraction and image anal-
ysis problems has not been determined.
•
~::~,:~,~:':;<~~_;1,':;,~ ;:'.:: i, . r
,
'C'
o
{.
'-'.,1""i_>'~_''''
.
~'1f'~)-.. -,,' , -,
::,,)''''i'';,': . -
".
e',
~'"-'
,~ , ;,'- ., "
"""'-'/"
r::;~·;/~,;;,.·,,"
"._ 'F . ~,
,,__ ,
",_,
v"-""
,.-., ''''''''~' ,"
,,"
):-1..'''''
,.-.,-
r,; -'"",',
.
.
",.
~,::" '~', . ,,'
ft
~:\, 'j
r,
"". ','
~'. ~
I-.
~.
\'
'~
Figure 5.14 Haar transform of the 256 x 256 Figure 5.15 Slant transform of the 256 x 256
image shown in Fig. 5.6a. image shown in Fig. 5.6a.
8 - 1 [1 1] (5.127)
I ~ v'2 1 - 1 ,
,
•
The parameters an and b; are defined qy the recursions ,
•
. Figure 5.1 shows the basis vectors of the 8 x 8 Slant transform. Figure 5.2 shows the
basis images of the 8 x 8 two dimensional Slant transform. Figure 5.15 shows the
Slant transform of a 256 x 256 image. .
1=- sequency=2
2'
•
N . Z(i -~) + 1, •
i = even
Z+2S.isN-l, sequency=
2(i -~), i = odd
-
U = 4>v == 2: V(k)~k (5.134) .
k=O
where ~k is the kth column of W. From (2.44) we know 4> reduces R to its diagonal '.
form, that is,
(5.135)
We often work with the covariance matrix rather than the autocorrelation
matrix. With f.t ~ E[u], then .
Ro a. cov[u] ~ E[(u - f.t)(u - f.tYl = E[uu T ] - f.tf.tT = R -"f.tf.tT (5.136) •
•
= (N;A./'\in w.(m+l·
N;l)+
(k~l)'lT, O<m,ksN-1
Example 5.S .
•
, Since the unitary DFT reduces any circulant matrix to Ii diagonal form, it is
!r<llU ue' taeorrelaljQll.IDatrices~lhat~,for
. .odi quenees-.
The DCT is the KL transform of a random sequence whose autocorrelation
matrix R commutes with Qc of (5.95) (that is, if RQc =' QcR). Similarly, the DST is thtl
KL transform of all random sequences whose autocorrelation matrices commute with
Q of (5.102). . .
•
KL TransfOrm of Images'
If an N x N image u(m, n)is re,eresented by a random field whose autocorrelation
function is given by .. ' '" ",." "..... '"' ..
E[u(m,n)u(m ', n')]=r(m,n;m', n'l, Osm, m', n, n' sN -1 (5.139)
•
N-I N-l
L L r(m, n;m'; n ')l.\Ik,/(m I, n')
m:"'On'=()
(5.140)
= Ak,ll.\Ik,l(m, n), 0 s k , 1 sN -1, Osm, n sN-1
In matrix notation this can be written as ,,'
-,
inlftj = A,"",it i = 0, , .. , N 2 - 1 (5.141)
where 1ft! is an N2 x 1 vector representation of l.\I.,1 (m, n) and in is an N Z x N Z
Z
autocorrelation matrix of the image mapped into an N x 1 vector (,p. Thus
in = £[",U'1] (5.142)
If f!t is s~arabl~, then the N 2 x N2 matrix 'It whose columns are {""J 'beComes
.separable (see Table 2.7). For example, let
;. ...
,.,.... , ,
(5.145)
where
«I>R«I>*1'
/11 = Aj ,. 'J' = 1, 2 (5.146)
and the KL transform of Ib is
o- = 'Put< r Ie' = [«I>i r ® 4J! 1]1e' (5.147)
For row-ordered vectors this is equivalent to
J:$i2Y(l;" . (5.148)
and the inverse KL transform is
u = «1>1 V(lI
,- ~- . 'i
(5.149)
The"advantage in modelinf the Imas . function-iS-tbat
instead of solving the N x N 2 matrixcigenvalue pmblem of (5.141~ only two
N x N matrix eigenvalue.prohlems.ot (5. 146} need to be solved, Since an N x N
matrix eigenvalue problem requires O(N) computations, the reduction in dimen-
sionality achieved by the separable model is 0 (N 6YO
(N ) == o (N ) , which is very
J J
significant. Also, the transformation calculations of (5.148) and (5.149) require 2NJ
operations compared to N 4 operations required for '\(1'*1' lb. .
,
Example 5.6
Consider the separable covariance function for a zero mean random field
r(m, n; m', n ') = plm-m'l pln-n'l (5.150)
This gives I7l = R ® R, where R is given by (2.68). The eigenvectors of R are given by
4>. in Example 5.4. Hence .=~®~ and the KL transform matrix is ~T®4tT.
Figure 5.2 shows the basis images of this 8 x 8 two-dimensional KL transform for
p = 0.95.
The KL transform has many desirable properties, which make it optimal in many
. signal processing applications. Some of these properties are discussed here. For
,~implicity~we assume u ~as zero mean and a pO:.,::s.:..:it::..,iv.::.e-:d::.;e=fl::.;·n,iteCllva.Iiance.mJltrix_R.
.
,
t,· -
.
.Decorrelation. The KL transform coefficients {v (k), k = 0, ... ,N - I} are
uncorrelated and have zero mean, that is, I , '
E[v(k)] =0
,
,
1- p2
1 o 1- fl2 o
-:-p :}VRL=
",-. ~D (5.153)
o <,
-p 1
o 1- p2
1
Hence the transformation v = LT u, will cause the sequence v(k) to be uncorrelated.
Comparing with Example 5.4. we see that L 4ofl. Moreover, L is not unitary and the .
diagonal elements of D are not the eigenvalues of R.
Basis restriction mean square error. .Consider the operations in Fig. 5.16.
The vector u is first transformed to v. The elements of ware chosen to be the first m
elements of v and zeros elsewhere. Finally, w is transformed to z, A and Bare
N x N matrices and I,. is a matrix with is along the first m diagonal terms and zeros
elsewhere. Hence
. w(k) = {v(k), Os:kS:m-l
. 0, k'2m
(5.154)
This quantity is called the basis restriction error. It is desired to find the matrices A
and B such that 1m is minimized for each and every value of me [I, N]. This
minimum is achieved by the KL transform of u,
~J:!ID!S.L.
•
The error I,,, in (5.155) is minimum when
\
A = 4l*T, B = e, AB =: I . . (5.156)
•
where the columns of ~ are arranged according to the' decreasing order of the
eigenvalues of R. . c .
To minimize 1,. we first differentiate it with respect to the elements of A and set the
result to zero [see Problem 2.15 for the differentiation rules). This gives
•
I",BT(I - BI",A)*R = 0 (5. 158}
which yields
1
J"'-="N Tr[(I-BI",A)R] (5.159)
I-BA-=O or B=A- 1 •
(5.161)
•
(5.163)
• v
\
",-I
J",~Tr(ImARA"'I) =: 2.: al'Rat (5.164)
k=O .
•
t; -= k=O
2.: aIRa; + L Ak(l - aIan
k-O
,. (5.166) • .
,
and differentiate it with respect to 3j. The result gives a necessary condition
•
• " (5.167)
• •
==Tr(ImA*TRA)
,
=Jm
which, we know from the last property [see (5.164)], is maximized when Ais the KI; .
transform. Since 0'1 = 't\k when A = ~*T, from (5.168)
m-l m-l
L: ~k 2: L: 0'1c, (5.1n)
k=O k=O·
•
Figure 5.17 Unitary transform data transmission. Each element of v is ended
independently.
be the reproduced values ofv and u, respectively. Further, assume that u, Y. y', and
_.
u' are Gaussian. The average distortion
.._" in u is ; .. .-"
1 .
D = NE(u - U·)*T(U - u·)] (5.172)
where •
\
c
\T1, =E[lv(kWl == [ARA*T]k.k (5.176)
depend on the transform A. Therefore, the rate
R ==R(A) (5.177)
also depends on A. For each fixed D, the KL transform achieves the minimum
1¢ . ", ...._ _ M •• . - . _ • - - " ,
rate.
among
, all unitary transforms, that is,
. '-,~
. . .
. , "
The KL transform is •
1 1 . ,
1 . ,-1
Suppose we let 6 be small. say 6 < l-Ipl. Then it is easy to show that
R(lI» < R(I)
This means for a fixed level of distortion, the number of bits required to transmit the
KLT sequence would be less than those required for transmission of the original
•
sequence. •
100.00
10.00
• •
1.00 Sine
0.10
Slant
•
Cosine, Kl
1 2 3 4 5 6 7 8 9 10 11 12 13 14 1S
•
Index k
•
. Exampli., 5.10 (Performance of Transforms on Images)
The mean square error test of the last example can be extended to actual images. .
Consider an N x N image u (m, n) from which its mean is subtracted outto make it zero
mean. The transform coefficient variances are estimated as •
ci'(k., I) = Ellv(k, I)I~ s Iv(k, 1)12
•
.'• •
•
25
Sine
20
.- OFT
-- -.
/I! ,
i
...~ w 15
~
•
•
•
10
. 5
eosine. f< L
,
•
I
Slant
•
FIgure 5.19 Performance of different unitary transforms with respect to basis,
restriction errors (Jnt) versus the number of basis (m) for a stationary Markov
sequence with N = 16, p = 0.95.
,
•
energy in stopband
=
total energy
k,/-O
Figure 5.21 shows an original image and the image obtained after cosine transform
zonal filtering to achieve various sample reduction ratios. Figure 5.22 shows the zonal
filtered images for different transforms at, a 4: 1 sample reduction ratio. Figure 5.23
shows the mean square error versus sample reduction ratio for different transforms.
Again we find the cosine transform to have the best performance.
,
,
.'IN·. """".__
. , .« •
,
•
'.•
(a~Original; (bI4: 1 sample reduction;
•
-
, '
•
,
_a"_'" it;l',¥;'«;'Jiki~',i4, ,g:" ....
(c) 8 : 1 sample reductlqn;
•
, .
td) 16 : 1 sample reduction.
Figure 5.21 Basis restriction zonal filtered images in rosine transform domain.
•
Sec. 5.11 '. The KL Transform 173
,,
• ..r •
;r1:~' •
, 1:<-
,,,-:
;j
J ,
•
- - At~Y'j,
i
<
•
'" '
'1'4
,,. d
.""
,..
"
-'
I,
':J
.. if'
•,
• "I
:,.~
I -,
-OJ'"
\ ~i 4
.'/
,
•. .- >
:,' ,
)
,
I
" • ...
,
•
10
~ 6
w
<t.l
:;;
-e
w 5
-.E.
.!:!
~
0 •
Z 4
OFT
3
Sin. Haar
•
1 Cosl""
Figure 5.23 Performance comparison of
o L- ...L_ _..,....---1_--.,;_---J different transforms with respect to basis
16 8 4 2 restriction zonal filtering for 2S6 x 256
• •
Sample reduction ratio Images.
. .
5.12 A SINUSOIDAL FAMILY OF UNITARY TRANSFORMS
, - ,
This is a class of complete orthonormal sets of eigenvectors generated by the \
I
. parametric family of matrices whose structure is similar to that of R- [see (5.96)],
I-kIll -a k,a
0 ,
-a 1
J = J(k!> kz, k 3) = (5.180)
, •
1 -a • •
I 0
,
k,fX -a l-kza
(5.182)
,
(5.183)
•
The J matrices playa useful role in performance evaluation of the sinusoidal trans-
forms. For example, two sinusoidal transforms can be compared with the KL
transform by comparing corresponding J-matrix distances --
. A(k h k 2,k3) ~ IIJ(kt, k 2 , k 3) - J(p, p, 0)11 2 (5.184)
•
This measure can also explain the close performance of the DCT and the
KLT. Further, it can be shown that the DCT performs better than the sine trans-
form for 0.5 :5 P :5 1 and the sine transform performs better than the cosine for other
values of p. The J matrices are also useful in finding a fast sinusoidal transform
approximation to the KL transform of an arbitrary random sequence whose co-
variance matrix is A. If A commutes with a J matrix, that is, AJ =JA, then they will
have an identical set of eigenvectors. The best fast sinusoidal transform may be
chosen as the one whose corresponding J matrix minimizes the commuting distance
IIAJ - JAII2• Other uses of the J matrices are (1) finding fast algorithms for inversion
of banded Toeplitz matrices, (2) efficient calculation of transform coefficient vari-
• ances, which are needed in transform domain processing algorithms, and (3) estab-
lishing certain useful asymptotic properties of these transforms. For details see [34].
•
•
where 'lit and 1;1) are N x rand M x r matrices whose mth columns are the vectors
'!1m and cf>m, respectively, and A 1/2 is an r x r diagonal matrix, defined as
~~,~ 0
~~
,~
. 0 '~ (5.189) .
Equation (5.188) is called the spectral representation, the outer product expansion,
or the singular value decomposition (SYD) of U. The nonzero eigenvalues (of
UTU), Am' are also called the singular values of U. If r < M, then the image contain-
ing NM samples can be represented by (M + N)r samples of the vectors
{A~4 IJIm ,A::'4 cf>m ; m =: 1, ... ,r}. .
Since 'lit and 1;1) have orthogonal columns, from (5.187) the SVD transform of
the image U is defined as ' .
(5.190)
which is a separable transform that diagonalizes the given image.The proof of
(5.188) is outlined in Problem 5.31. "
1. Once c1Jm ,m == 1, ... , r are known, the eigenvectors IJIm can be determined as
.:i 1 .
I/Im ==. ;:-UclJm, m = 1, ..• ,r (5.191)
vx, ,
•
It can be shown that I/Im are orthonormal eigenvectors of UU if c1Jm are the T
• '(5.192)
Sec. 5.13· Outer Product Expansion and Singular Value Decomposition' 177
• 3. The image U, generatedby the partial sum
•
U. ~ 2: 0:: Ilfm $~ , .: k s r (5.193)
reduces to
(5.195)
Let L ~ NM. Note that we can always write a two-dimensional unitary transform
representation as ail outer product expansion in an L-dimensional,space, namely,
L
U=2:wI8Ibl (5.196)
1= I
where WI are scalars and III and b l are sequences of orthogonal basis vectors of
dimensions N x 1 and M x 1, respectively. The least squares error between U and
any partial sum .
(5.197)
is minimized for any k E [1, L] when the above expansion coincides with (5.193),
. -
that IS, when Uk = Uk.
This means the energy concentrated in the transform coefficients w{,l = 1, ... , k
is maximized by the Sl/D transform for the given image. Recall that the KL trans-
form, maximizes the average energy in a given number of transform coefficients, the
average being taken over the ensemble for which the autocorrelation function is
defined. Hence, on an image-to-image basis, the SVD. transform will concentrate
more energy in the same number of coefficients. But the SVD has to be calculated
for each image. On the other hand the KL transform needs to be calculated only
once for the whole image ensemble. Therefore, while one may be able to find a
. reasonable fast transform approximation of the KL transform, no such fast trans-
form substitute for the SVD is expected to exist.
•
Although applicable in image restoration and image data compression prob-
lems, the usefulness of SVD in such image processing problems is severely limited
because of large computational effort required for calculating the eigenvalues and
eigenvectors of large image matrices. However, the SVD is a fundamental result in
matrix theory that is useful in finding the generalized inverse of singular matrices
and in the analysis of several image processing problems.
Example 5.11 •
Let 1 2
U= 2 1
1 3·
The eigenvalues ofUTU are found to be Al == 18.06, A2 == 1.94, which give r = 2, and the
SVD transform of U is
A ' 12 - [4.25 0 ] _
- 0 1.39
DFf/llnitaIy DFf Fast transform, most useful in digital signal processing, convolution,
digital filtering, analysis of circulant and Toeplitz systems. Requires
complex arithmetic. Has very good energy compaction for images.
Cosine' Fast transform, requires real operations, near optimal substitute for
the KL transform of highly correlated images. Useful in designing
transform coders and Wiener filters for images. Has excellent
energy compaction for images.
Sine About twice as fast as the fast cosine transform. symmetric, requires
•
real operations; yields fast KL transform algorithm which yields
recursive block processing algorithms, for coding, filtering, and so
on; useful in estimating performance bounds of many image
processing problems. Energy compaction for images is very good.
Hadamard Faster than sinusoidal transforms, since no multiplications are
. required; useful in digital hardware implementations of image
processing algorithms. Easy to simulate but difficult to analyze. •
Haar Very fast transform. Useful in feature extracton, image coding, and
image analysis problems. Energy compaction is fair.
Slant Fast transform. Has "image-like basis"; useful in image coding. Has \
,
very good energy compaction for images.
Karhunen-Loeve Is optimal in many ways; has no fast algorithm; useful in performance \
evaluation and for finding performance bounds. Useful for small
size vectors e.g., color multispectral or other feature vectors. Has
the best energy compaction in tbe mean square sense over an
ensemble.
•
FastKL Useful for designing fast, recursive-block processing techniques,
including adaptive techniques. Its performance is better than
independent block-by-block processing techniques. •
Sinusoidal transforms Many members have fast implementation, Useful in finding practical
substitutes for the I(L transform, analysis of Toeplitz systems, •
\
, mathematical modeling of signals. Energy compaction for the
optimum-fast transform is excellent.
SVD transform Best energy-packing efficiency [or any given image. Varies drastically
from image to lmage.has no fast algorithm or a reasonable fast
• transform substitute; useful in design of separable FIR filters,
finding least squares and minimum norm solutions of linear
equations. finding rank of large matrices, and so on. Potential •
Sec. 5.13 Outer Product Expansion and Singufar Value Decomposition 179
•
From above +, is obtained via (5.191) to yield
1.120 1.94
V, = ~ +1 4Jf = 0.935 1.62
1.549 2.70
as the best least squares rank-1 approximation of V. Let us compare this with the two
dimensional cosine transform U, which is given by
v'Z v'Z v'Z 1 2 [1 1] 1 10v'Z -2v2
V"C;lUC~=.-l- v3 0 -v3 2 1 1 -1 = -v3 v3
v12 1 -2 1 1 ' 3 ' v12 -1 -5
It is easy to see that };2: v Z (k, I) = A, + Az. The energy concentrated in the K
k,J
samples ofSVD, k;_tAm, K = 1, 2, is greater than the energyconcentrated in any K
samples of the cosine transform coefficients (showl). . '
•
5.14 SUMMARY
In this chapter we have studied the theory of unitary transforms and their proper-
ties. Several unitary tranforms, OFT, cosine, sine, Hadamard, Haar, Slant, KL,
sinusoidal family, fast KL, and SVD, .were discussed. Table 5,3 summarizes the
various transforms and their applications.
PROBLEMS,'
•
5.1 For given P, Q show that the error o-i of (5.8) is minimized when the series coefficients
v(k, l) Me given by (5.3). Also show that the basis images must form a complete set for
0"; to be Zero for P .. Q .. N.
5.2 (Fast transforms and Kronecker separability) From (5.23) we see that the number of '
operations in implementing the matrix-vector product is reduced from 0 (N<) to
o (N ) if A is a Kronecker product. Apply this idea inductively to sbow that if .A is
3
M xMand
•
,A = A I@A2@ ... @Am
where Ate is n. x n., M = Uk= I n., then the transformation of (5.23) can be imple-
mented in O(M "i::. t nk), which equals nM log. M if n. = n. Many fast algorithms for
unitary matrices can be given this interpretation which was suggested by Good [9].
Transforms possessing this property are sometimes called Good transforms.
5.3 For the 2 x 2 transform A and the image V
t v3 1 U= 2 3
A=z -1 v3' 1 2
calculate tbe.transfonped image V and the ~ges.
-
180 Image Transforms Chap. $,
5.4 Consider the vector x and an orthogonal transform A
x= Xo A= co~a sin a
XI' , -Sin e cos (I
Let 11.0 and at denote the columns of AT (that is, the basis vectors of A). 'The trans-
, formation' y = Ax' can be written as Yo =;J;x,y, = aT x, ,Represent the vector x in
Cartesian coordinates on a plane. Show that the transform A is a rotation of the
coordinates by a and Yo and y, are the projections of:lt in the new coordinate system
c (see Fig. PH).
5.5 Prove that the magnitude of determinantof a unitary transform is unity. Also show
that all the eigenvalues of a unitary matrix have unity magnitude.
Show that the entropy of an N x 1 Gaussian random vector u with meanp, and
covariance R, given by
1""'H-(-U)-=-!:!';";2-10-g,-(-2'll'-e-IRu--11I~ €
,
1 2 3 4
2 '1 4 3.
gr=; 1--+---)
3 4 1 2
43 2 1
0 -2
N
N - 1
\
0 ,
,
R
•
~
"
-2
N
AI-I
N
•
-2
. '
Figure PS.12
•
where IJo is a constant, remains the same only if the vector 1 ~ (1, 1, ... , IV is an
eigenvector of the covariance matrix of u, Whi"h of the fast transforms discussed in the
text satisfy this property?
M3 If u, and U2 are random vectors whose autocorrelation matrices commute, then show
that they have a common KL transform, Hence, show that the KL transforms for
autocorrelation matrices R, R- 1, and f(R), where f(·) is an arbitrary function, are
identical. What are the corresponding eigenvalues?
5.24* The autocorrelation array of a 4 x 1 zero nlean vector u is given by {O.95 Im- nl,
Osm, n s3}.
,
a. What is the KL transform of u?
• b. Compare the basis vectors of the KL transform with the basis vectors of the 4 x 4
unitary DFT, DCT, DST, Hadamard, Haar, and Slant transforms.
Co Compare the performance of the various transforms by plotting .
the basis restriction
,
• error Jm versus m.
5.25* TIle autocorrelation function of a zero mean random field is given by (5.150), where
. p = 0.95. A 16 x 16 segment of this random field is unitarily transformed. .
•
a. What is the maximum energy concentrated in 16, 32, 64,' and 128 transform coeffi-
cients for each of the seven transforms, KL, cosine, sine, unitary DFT, Hadamard,
Haar, and Slant? .
•
•
• b. Compare the performance of these transforms for this random field by plotting
the mean square error for sample reduction ratios of 2, 4, 8, and 16. (Hint: Use
Table 5.2.) • •
5.26 (Threshold representation) Referring to Fig. 5.16, where u(n) ill a Gaussian random
sequence, the quantity
IN-I . IN-l ",-I •
5.27 (Minimum entropy property of the KL transform) (30] Define an entropy in the'
A-transform domain as 1
N-'
H[A] "" - 2: 0"1 log. 0"1
.~O
where 0"1 are the variances ofthe transformed variables v (k). Show that among all the
unitary transforms the KL transform minimizes this entropy, that is"H[eJI'1 s H[A].
5.28 a. Write the N x N covariance matrix R defined in (2.68) as
. . 132 R " = J(k k 2 , k3) - A,J
"
where AJ is a sparse N x N matrix with nonzero terms at the four comers. Show
that the above relation yields
R = 13 2
r ' + 13 r ' AJr ' + r ' (AR)r '
2
where AR ~ AJRAJ is also a sparse matrix, which has at most four (comer) non-
zero terms. If eJI diagonalizes J, then show that the variances of the transform
coefficients are given by .
! .
••
•
• . P5.28-3
Osk<N-l
where 1I.. = 1 - 2a cos(k + 1)-rr/(N + 1), and •
b. Using the formulas P5.28·2-P5.28-4 and (5.120), calculate the fraction of energy
packed in N!2 transform coefficients arranged in decreasing order by the cosine,
sine, unitary DFT, and Hadamard transforms for.N '" 4, 16, 64, 256, 1024, and
.4096 for a stationary Markov sequence whose autocorrelation matrix is given by
R={plm .. ni},p"'O.95. . ,
S.Z!i a. For an arbitrary real stationary sequence, its autocorrelation matrix, R ~ {r(m - n)},
is Toeplitz. Show that A-transform coefficient variances denoted by aHA), can be
obtained in O(N logN) operations via the formulas
1 N-l
a~(F) = Nn-~+l (N -lnj)r(n)WN nk
, . F = unitary DFt'
u~AA) '" 2: 2: 2:
Itt ,. IH;'
L: a(k, m)a* (k, m ')r(m -
n'
m',n - n ')a(t, n)a* (t, n ')
2
Show that CT1AA) can be evaluated in O(N log N) operations, when A is the FFT,
DST, or DCT. '
5.30 Compare the maximum energy packed in k SVD transform coefficients for k = 1,2, of
the 2 x 4 image .
u=G~~:)
with that packed by the cosine, unitary DFT, and Hadamard transforms.
5.31 (ProofofSVD representation) Define cit", such that m '" 0 for m = r + 1, •.• ,M SO\U+
that the set cit.. , 1:s m s M is complete and orthonormal. Substituting fur from .m
•• (5.191) in (5.188), obtain the following result: .. .
r r . '
, ,
1. H. C. Andrews. Computer Techniques in [mage Processing. New York: Academic Press,
1970, Chapters 5, 6. ,
. 2. H. C. Andrews. "Two Dimensional Transforms." in Topics in Applied Physics: Picture
Processing and Digital Filtering, vol. 6, T. S. Huang (ed)., New York: Springer Verlag,
1975. '" ,
.
3. N. Ahmed and K.R. Rao. Ortlwgonal Tramforms for Digital Signal ProCess/nt. New
York: Springer Verlag, 1975.
,
Section 5.3
9. I. J. Good. "The Interaction Algorithm and Practical Fourier Analysis." J. Royal Stat.
Soc. (London) B20 (1958): 361..
10. J. W. Cooley and J. W. Tukey, "An Algorithm (or the Machine Calculation of Complex
Fourier Series," Math. Comput. 19,90 (April 1965): 297-301.
11. IEEE Trans. Audio and Electroacoustics. Special Issue, on the Fast Fourier Transform
AU-IS (1967).
.12. G. D. Bergland. "A Guided Tour of the Fast Fourier Transform." IEEE Spectrum
6 (July 1969): 41-52.
. 13. E. O. Brigham. The Fast Fourier Transform. Englewood Cliffs, N.J.: Prentice-Hall,
19~. •
14. A. K. Jain. "Fast Inversion of Banded Toeplitz Matrices Via Circular Decomposition."
IEEE Trans. ASSP ASSP-26, no. 2 (April 1978): 121-126.
15. N. Ahmed, T. Natarajan, and K. R. Rao. "Discrete Cosine Transform." IEEE Trans. on
Computers (correspondence) C-23 (January 1974): 90-93.
16. A. K. Jain. "A Fast Karhunen Loeve Transform for a Class of Random Processes."
IEEE Trans. Communications, Vol. COM-24, pp. 1023-1029, Sept. 1976.
17. A. K. Jain. "Some New Techniques in Image Processing," Proc. Symposium on Current
Mathematical Problems in Image Science, Monterey, California, November 10-12, 1976.
18. W. H. Chen, C. H. Smith, and S. C. Fralick. "A Fast Computational Algorithm for tbe
Discrete Cosine Transform." IEEE Trans. Commun. COM-25 (September 1977):
1004-1009.
• •
19. M. J. Narasimha and A. M. Peterson. "On the Computation of the Discrete COiTOe
Transform. IEEE Trans: Commun., COM-26, no. 6 (June 1978): 934-936. •
Section 5.11
For minimum mean square variance distribution and entropy properties We follow,
primarily, [30] and [31]. Some other properties of the KL transform are discussed
•
·m: . .
•
Chap. 5 Bibliogtaphy 187
•
•
34. A. K. Jain. "A Sinusoidal family of Unitary Transforms." IEEE Trans. Pauern Anal.
• •
Section 5.13
35. G. E. Forsythe, P. Henrici. "The Cyclic Jacobi Method' for Computing the Principal
Values of a Complex Matrix." Trans. Amer. Math. Soc. 94 (1960): 1-23. . .
36. G. H. Golub and C. Reinsch. "Singular Value Decomposition and Least Squares Solu-
tions." Numer. Math. 14 (1970): 403-420.
37. S. Treitel and J. L. Shanks. "The Design of Multistage Separable Planar Filters." IEEE
Trans. Geoscience Elec. Ge-9 (January 1971): 10-27.
•
•
•
Image Representation
by Stochastic Models
6.1 INTRODUCTION
Covariance Models
• 189
• Stochastic models lJsed.in image processing
r I .: . . . 'I -
Covarjance models One-dlmeosional lt-Dl modelS Two'dimensionaI12-D) model.
- -~ .~---- .
e(nl + •
u I nl
+ ,,
•
+ • uln) e(nl
p
• I
,-j
l: =a(klr'
p
. u(n) = L: a(k)u(n - k) + ein], "In (6.3a)
k=1
PropClrties of AR Models
The quantity
. p
is the best linear mean square predictor of u(n) based on all its past but depends
only on the previous p samples. For Gaussian sequences this means a pth-order AR
sequence is a Markov-p process [see eq. (2.66b)). Thus (6.3a) can be written as
u(n) = Ii (n) + 8(71) (6.5)
which says the sample at 71 is the sum of its minimum variance, causal, prediction
estimate plus the prediction error e(n), which is also called the innovations sequence.
Because of this property an . AR model is sometimes called a causal minimum .
For this reason, Ap(z) is also called the whitening filter for u(n). The proofis
considered in Problem 6.1. .
Except for possible zeros at z = 0, the transfer function and the SDF of an AR
• _ R2 .
S(z) J::: z :;:: el "' , (6.9)
- A p (z)A p (z -I) ,
Because r, (n) = f:)2 0(n) gives S, (z) = a
2, this formula follows directly by applying
(6.1).
For sequences with mean j.l., the AR model can be modified as
p
x(n) = L
k-'J
a(k)x(n - k) + 8(n)
(6.lOa)
,
u(h) = x(n) + fJ.
where the properties of 8(n) are same as before. This representation can also be
written as
p p
u(n) = L a(k)u(n - k) + 8(n) + fJ. 1- L a(k) (6. 1Gb)
k=l k=J
r(n) - L
k=J
a(k)r(n - k) = f:) 23(n), Vn ;::: ° (6.12)
where r(n) ~ E[u(n)u(O)] is the covariance function of u(n). This result is im-
portant for identification of the AR model parameters a (k), 132 from a given set of
covariances {r(n), -p :s;, n :s;,p}. In fact, a pth-order AR model can be uniquely
determined by solving (6.12) for n = 0, ... ,po In matrix notation, this is equivalent
to solving the following normal equations:
,
Ra=r (6. 13a)
• (6. 13b)
, •
where R is the i x p Toeplitz matrix
· reO) r(l) ... rep -1)
R~ r(l) . ••
•
•
- .• r(l)
(6.13c)
• • . '
rep -1) ... r(l) reO)
•
. <T2[~~J[:ml=(T2[~2]
2
which gives a(1) = p, a(2) = 0, and 13 "" (T2 (1- p2).The corresponding representation
for a scan line of the image, having pixel mean of jJ., is a first-order AR model
.
x(n) = px(n -1) +8(n), r. (n) = (T2 (1- p2)8(n)
a(n) = x(n) + f.l. (6.14)
withA(z) = 1- pz-" S. = (T2(1- p2), and S(z) =(12(1- p2)/[(1 - pz-l)(I- pz)J.
rep
,
+ n):=
.
2:
k l
=;
a(k)r(p + n - k), "In <!: 1
(6.15)
.. r(-n):=r(n), "In
This extension has the property that among all possible positive definite extensions
of {r(n)}, for In I> p, it maximizes the entropy
•
H~2,;
1 f"
_"If 10gS(w)dw (6.16)
t
. <. where Sew) is the Fourier transform of {r(n),'v'n}. The AR model SDF Sew), which .
can be evaluated from the knowledge of a(n) via (6.9), is also called the maximum
entropy spectrum of {r(n), Inl :sp}. This result gives a method of estimating the
power spectrum of a partially observed signal. One would start with an estimate of
the p + 1 covariances, {r(n), O:;:;:n :S:p}, calculate the AR model parameters [32,
.a (k), k= 1, ... .p, and finally evaluate (6.9). This algorithm is also useful in certain
image restoration problems [see Section 8.14 and Problem 8.26].
• • •
An,o!z) AR model
,
exploited in data compression of images and other signals. For example, a digitized
AR sequence u(n) represented by B bits/sample is completely equivalent to the
digital sequence i (n) ~ u (n) ~ "if (n), where u' (n) is the quantized value of u (n).
The quantity E (n) represents the unpredictable component of u(n), and its entropy
is generally much less than that of u (n). Therefore, it can be encoded by many fewer,
bits per sample than B. AR models have also been found very useful in representa-
tion and linear predictive coding (LPC) of speech signals [7}.
Another useful application of AR models is in semirecursive representation of
images. Each image column Un, n = 0, 1,2, ... , is first transformed by a unitary
matrix, and each row of the resulting image is represented by an independent AR
model. Thus if V n ~ 'lTu., where 'IT is a unitary transform, then the sequence
{lin (k), n = 0, 1, 2 ...} is represented by an AR model for each k (Figure 6.3), as
p
ll.(k) = S ai(k)vn_i(k) + e.(k), 'rIn,k =O,I, ... ,N-l
i= 1
(6.17)
The optimal choice of 'IT is the KL transform of the ensemble of all the image
columns so that the elements of v, are uncorrelated. In practice, the value of p = 1
or 2 is sufficient, and fast transforms such as the cosine or the sine transform are
good substitutes for the KL transform. In Section 6.9 we will see that certain,
so-called sernicausal models also yield this type of representation. Such models are
useful in filtering and data compression of images [see Sections 8.12 and 11.6].
•
Moving Average IMA) Representations
A random sequence lI(n) is called a moving average (MA) precess of order q when
it can be written as a weighted running average of uncorrelated random variables
,
q. •
•
u(n) = 2: b(k)e(n - k) (6.18)
k=O •
,
where e(n) is a zero mean white noise process of variance (32 (Fig. 6.4). The SDF of
this MA is given by .
•
(6.19a)
q
B q (z) = 2: b(k)z-k • (6.19b)
k=O
•
From the preceding relations it is easy to deduce that the covariance sequence of a
qth-order MA is zero outside the interval r-q, q]. In general, any covariance
sequence that is zero outside the interval [-q, q Jcan be generated by a qth-order
MA filter B q (z), Note that Bq(z) is an FIR filter, which means MA representations
are all-zero models.
Example (i.2
Consider the first-order MA process
2
u(n) "''l;(n) - ns(n -1),. E[,,(n),,(m)] == 13 l\(m - n)
• Then B, (z) = 1- nz·" S(z) = {32[1 + a - «(z + Z-l)J. This shows the covariance s.e-· .
2
quence of u(n) is reO) = 13'(1 + a'), r( ±1) = -a(3" r(n) = 0, Inl .>, 1.
I !, "
Autoregressive Moving Average (ARMAI Representations
•
An AR model whose input is an MA sequence yields a representation of the type
(Fig. 6.5)
p q •
where £(n) is a zero mean white sequence •of variance 132 • This is called an ARMA •
representation of order (p, q). Its transfer function and the SDF are given by
p x 1 vector sequence of independent random variables and TIn is the additive white
noise. The matrices An , B, , and en are of appropriate dimensions and T/n , ell satisfy
•
depends on 'k, the position of the raster, as well as the displacement variable l. Such
a covariance function can only yield a rime-varying. realization, which would
increase thecomplexity of associated processing algorithms. A practical alternative
is to replacers (k, I) by its average over the scanner locations [9J, that is, by
N - 1
Given is (I), we can find a suitable order AR realization using (6.12) or the results of
the following two sections. Another alternative is to determine a so-called cycle-
stationary state variable model, which requires a periodic initialization of the states
for the scanning process. A vector scanning model, which is Markov and time
invariant, can be obtained from the cyclostationary model [10J. State variable
scanning models have also been generalized to two dimensions [11,12]. The causal
models considered in Section 6.6 are examples of these. .
specifying the phase of H (w), because its magnitude can be calculated within a
constant from Sew). .
Rational SDFs
For such SDFs it is always possible to find a causal and stable filter H(z). The
method is based on a fundamental result in algebra, which states that any'
polynomial Pn (x) of degree n has exactly n roots, so that it can be reduced to a
product of first-order polynomials, that is,
n
P; (x) = TI (aix -131) (6.31)
i:- 1 •
jw
For a proper rational S(z), which is strictly positive and bounded for z = e , there
will be no roots (poles or zeros) on the unit circle. Since S(z) = S(Z-I), for every
root inside the unit circle there is a root outside the unit circle. Hence, if H(z) is
chosen so that it is causal and all the roots of AI' (z) lie inside the unit circle, then
(6.27) will be satisfied, and we will have a causal and stable realization. Moreover, if
Bq (a) is chosen to be causal and such that its roots also lie inside the unit circle, then
the inverse filter 1/8(z), which is A p(z)IBq(z), is also causal and stable. A filter that
is causal and stable and has a causal, stable inverse is called a minimum-phase filter.
Example 6.3
Let
(6.32)
-,
The roots of the numerator are ZI = 0.25 and Z2 = 4 and those of the denominator are
z, = 0:5 and z, = 2. Note the roots occur in reciprocal pairs. Now we can write
S(z) = ](1 - O.25z -1)(1 - 0.25t) •
(1-: O.5z -')(1 - OSz)
. .
. Comparing this with (6.32), we obtain K = 2. Hence a filter with H(z) = (1- 0.25z- 1)1
(1-0.5z -I) whose input is zero mean white noise, with variance of 2, will be a mini-
mum phase realization of S(z). The representation of this system will be
" ·u(n}= O:5u(n -1) + e(n) - O.25e(n -1)
• E[e(n)] '" 0, E[e(n)e(m)] = 28(n - m) • (6.33)
This is an ARMA model of order (1, 1).
It is not possible to find finite-order ARMA realizations when S(z) is not rational.
The spectral factors are irrational, that is, they have infinite Laurent series. In
practice, suitable approximations are made to obtain finite-order models. There is a
subtle difference between the terms realization and modeling that should be pointed
out here. Realization refers to an exact matching of the SDF or the covariances of
the model output to the given quantities. Modeling refers to an approximation of
the realization such that the match is close or as close as we wish.
One method of finding minimum phase realizations when S (z) is not rational
is by the Wiener-Doob spectral decomposition technique [5, 6]. This is discussed for
2-D case in Section 6.8. The method can be easily adapted to 1-D signals (see
Problem 6.6). .
I
6.4 AR MODELS, SPECTRAL FACTORIZATION,
AND LEVINSON ALGORITHM
The theory ofAR models offers an attractive method of approximating a given SDF
arbitrarily closely by a finite order AR spectrum. Specifically, (6.13a) arid (6.13b)
can be solved for a sufficiently large p such that the SDF Sp (z) ~ /3},/[A p (z)Ap(Z-I»)
is as close to the given S (z),.z == exp(jw), as we wish under some mild restrictions on
S(z) [1]. This gives the spectral factorization of S(z) as p -> 00. If S(z) happens to
have the rational form of (6.9), then a(n) wili turn out to be zero for all n > p, An
efficient method of solving (6.13a) and (6. 13b) is given by the following algorithm;
where n = 1,2,,, .. .p. The AR model coefficients are given by a (k) = Qp (k),
13 = j3~" One advantage ofthis algorithm is that (6.13a) can now be solved in O(p2)
2
quences can be determined uniquely from these recursions. This property is useful
in developing the stability tests for 2-D systems. The Levinson algorithm is also
useful in modeling 2-D random fields, where the SDFs rarely have rational factors.
Example 6.4
The result of AR modeling problem discussed in Example 6.1 can also be obtained by
applying the Levinson recursions for p = 2. Thus reO) = 0'., r(l) = a p, r(2) = 0'2 p.
2
and we get
0"2 p
Pi = 2 = p, 13r= 17 2(1 - p2),
,
0"
p, =-\13 I
[a 2 pZ _ paZp]
'
=0 •
This gives /.)2:= 13! = 131 , a(1):= Q,(l) = a, (1) = p, and a(2) = az (2) = 0, which leads to
(6.14). Since Ip,l and Ilhl are less than I, this AR model is stable.
"
No
I
•,
,, •
Yes ,
•
\
..,." •
Stop
•
where a(k) are determined to minimize the variance of the prediction error
u(n) - u (n). The noncausal MVR of a random sequence. u(n) is then defined as
where v(n) is the noncausal prediction error sequence. Figure 6.7a shows the
noncausal MVR system representation. The sequence u(n) is the output of a non-
causal system whose input is the noncausal prediction error sequence v(n). The
transfer function of this system is l/A (z), where A (z) !I. 1 - 2:"",,0 Q'.(n)z -n is called
the noncausal MVR prediction error filter. The filter coefficients a(n) can be deter-
mined according to the following theorem.
Theorem 6.1 (Noncausal MVR theorem): Let u(n) be a zero mean, sta-
j
tionary random sequence whose SDF is S(z). If lIS(e ",) has the Fourier series
00
1 .~ 2: -: (n)e-j",n
• S(e'"') n~-ro
r+(n) =-If''
2'1l' _.".
S-l(ejw)ei""'dw - (6.37) .
-r+ (n)
a(n) = r" (0)
•
2A ." . 1
13 := E{[v(nW} = r+ (0) (6.38)
Moreover, the covariances r, (n) and the SDP SV (z) of the noncausal prediction
error v(n) are given by .
rv(n):= -13 2 Q'. (n), <x(O) ~-1
•
•
The proof is developed in Problem 6.10.
. .
. ~
Aizl = 1 - ~ ",(n)r"
fl"'·-""
n-O
Remarks
1. The noncausal prediction error sequence is not white. This follows from
(6.39). Also, since r v (-n) = r, (n), the filter coefficient sequence o:.(n) is ~wen,.
that is, 01.( -n) = OI.(n), which implies A (z -I) = A (z),
2. For the linear noncausal system of Fig. 6.7a, the output SDF is given by
S(z) = S; (z)/(A (z)A (Z-I)J. Using (6.39) and the fact that A (z) = A (Z-l), this
becomes •
.
. (6.40)
.
3. Figure 6.7b shows the algorithm for realization of noncausal MVR filter
coefficients. Eq. (6.40) and Fig. 6.7b show spectral factorization of S(z) is not
required for realizing noncausal MVRs. To obtain a finite-order model, the
Fourier series coefficients r+ (n) should be truncated to a sufficient number of
terms, say, p. Alternatively, we can find the optimum pth-order minimum
variance noncausal prediction error filter. For sufficiently large p, these meth-
ods yield finite-order, stable, noncausal MVR-models while matching the
. given
." covariances to a desired accuracy.
. '.
(6-41)
(l"'" P .
. 1 +l
•
•
,,'-'- -~,
which means •
• •
••
1
0'(0) = -1. uO) '" 0'.(-1) = Q, 'r+(O)=-
(3'
The resulting noncausal :-'1 VR is
u(n) = a[u(n - 1) + u(n + l)J + v(ll)
•
S,(z) = ~A(z)
•
,
where v(n) is a pth-order MA with zero mean and variance 13; (see Problem 6.11).
•
•
.A Fast KL Transform [13]
be obtained for higher order AR sequences also (see Problem 6.12). In general, the
KL transform of UO is determined by the eigenvectors of a banded Toeplitz matrix.
whose eigenvectors can be approximated by an appropriate transform from the
sinusoidal family of orthogonal transforms discussed
.
in Chapter 5. .
u(l }
- •
•
u c= • ,
•
•
u{N}
, •, ,
•
, , +
. UO
OlU(O )
- +
-
o
r •
The noncausal MVRs are also useful in finding the optimum interpolators for
random sequences. For example, suppose a line of an image, represented by :l
first-order AR model, issubsampled so that N samples are missing between given
samples. Then the best mean square estimate of a missing sample u(n) is
U (n) ~ E[u(n)iu(O), u(N + 1)], which is precisely u" (n), that is,
, ,
When the interpixel correlation p-l, it can be shown that u(n) is the straight-line
interpolator,between u(O) and u(N + 1), that is,
,
For values of p near 1, the interpolation formula of (6.53) becomes a cubic poly-
nomial in n/(N + 1) (Problem 6.13). ' '
The notion of causality does not extend naturally to two or higher dimensions.
Line-by-line processing techniques that utilize the simple I-D algorithms do not
exploit the 2-D structure and the interline dependence. Since causality has no
intrinsic importance in two dimensions, it is natural to consider other data structures
to characterize 2-D models. There are three canonical forms, namely, causal, semi-
causal, and noncausal, that we shall' consider here in the framework of linear
prediction. "
These three types of stochastic models have application in many image process-
ing problems. For example, causal models yieldo recursive algorithms in ' data' com-
pression of images by the differential pulse code modulation (DPCM) technique and
in recursive filtering of images. "
Sernicausal models are causal in one dimension and noncausal in the other and
lead themselves naturally to hybrid algorithms, which are recursive in one dimen-
sion and unitary transform based (nonrecursive) in the other. The unitary transform
decorrelates the data in the noncausal dimension, setting up the causal dimension
for processing by I-D techniques. Such techniques combine the advantages of high
performance of transform-based methods and ease of implementation of 1-D
, '
algorithms.
Noncausal models give rise to transform-based algorithms. For example, the
notion of fast KL transform discussed in Section 6.5 arises from the noneausal MVR,
of Markov sequences. Many spatial image processing operators. are noncaussl-«
that is, finite impulse response deblurring filters and gradient masks, The coeffi-
cients of such filters or masks, generally' derived by intuitive reasoning, can be
obtained more accurately and quite rigorously by invoking noncausal prediction
concepts discussed here. '
,
Causal Prediction
A causal predictor is a function of only the elements that arrive before it. Thus the
causal prediction region is (Fig. 6.9a) .
SI ={l ;;=:l~Vk} U {I =O,k;;=: I} (6.56)
This definition of causality includes the special case of single-quadrant causal pre-
dictors.
This is called a strongly causal predictor. In signal processing literature the term
causal is sometimes used for strongly causal models only, and (6.56) is also called
the nonsymmetric half-plane (NSHP) model [22].
e r
(l'
I
••
.... rm n
I
~ , ,
$, $, $3
(a) Causal (b) Sernicausal (c) Noncausal
• • •
Figure 6.9 Three canonical prediction regions Sx and the corresponding finite
• •
. prediction windows W.. x = 1, 2, 3: . '
Semicausal Prediction
Noncausal Prediction
We also define
A •
w,=Wx U (0,0), x=I,2,3 (6.61b)
•
Example 6.5
,
The following are examples of causal, semicausal, and noncausa! nredictors.
Given a prediction region for forming the estimate u(m, n), the prediction coeffi-
cients a(m, n) can be determined using the minimum variance criterion. This re-
quires that the variance of prediction error be minimized, that is,
~Z~minE[eZ(m,n)], e(m,n)~u(m,n)-u(m,n) (6.62)
The orthogonality condition associated with this minimum variance pred~ction is
E[e(nt, n)u(m - k, n '-l)J :=: 0, (k, l) E S., 'r/(m, n) (6.63)
•
(6.64)
(6.65)
from which we get.
,
2
, rtk, l) - L:L:.a(i,j)r(k - i, [- j) = fS 8(k, l), (k, l) E S., x = 1,2,3 (6.66)
", (i./) E Wx
The solution of the above simultaneous equations gives the predictor coefficients
at], j) and the prediction error variance (32. Using the symmetry property
rtk, I) = r( k, -I), it can be deduced from (6.66) that
-r .
a (-i, 0) = a (i, 0) for semicausal predictors
(6.67)
a(-i, -j) = a(i,j) for noneausal predictors
A random field characterized by this MVR must satisfy (6.63) and (6.64). Multi-
plying both sides of (6.70) by ll(m, n), taking expectations arid using (6.63), we
obtain
. •
E[u(m, n)e(m, n)] = 132 . '
. (6.71)
Using the preceding results we find
r.Ck, I) ~ E[e(m, n)ll(m - k, n-l)]
= E{e(m, n)[u(m - k, n -l) - u(m - k, n -I)]} (6.72)
,
= 2
13 3(k, l) - 2:2:. a(i, ne(k + i, I + j)
(i.n f w.
With 0(0, 0) ~ -1 and using (6.67), the covariance function, of the prediction error
eim, n) is obtained as i
r.(k, I) =
-13 3(1)
2
2: a(i, O)o(k+ 0, .. 'riCk, I) for semicausal MVRs
!"'" -p
p q .
-1>2.2: .2: a(i,j)8(k - ,
i)o(l - j), 'rI(k, I) for noncausal MVRs
j=-pj=-q
(6.73)
The filter represented by the two-dimensional polynomial
.
causal MVRs
P
-13 2: oem, o)z}m = 13 A (Z1> (0),
2 2
semicausal, MVRs
m== -p
noncausal MVRs
•
(6.76)
causal MVRs
semicausal
.. MVRs (6.77)
_,-!=-f32..:.A::...(~Z-"b:...;Z7Z),---.,.,.= /32 , noncausal
1
A (ZI zz)A (ZI I, Z2 ) A (zj, Zi) . MVRs
•
Thus the SDFs of all MVRs are determined completely by their prediction error
filters A (Z1> Z2) and the prediction error variances (32: From (6.73), we note the
causal MVRs are also white noise-driven models, just like the 1-D AR models. The
semicausal MVRs are driven by random fields, which are white in the causal
dimension and moving averages in the noncausal dimension. The noncausal MVRs
are driven by 2-D moving average fields. .
Remarks
causal model. S. (z" Z2) = pz, which means e(m, n) is a white noise field.
This gives
• Semicausal model. 2
S. (z" Zl) = 13 [1 - a, zi ' - 112Z,1 = /32 A (.1" 00). Because
the SDF S.(ZhZ2) must equal S, (2:'''z,'). we must have 0, = 02 and
•
2
S ( ) _ 1 3 (1 - o, zi ' - a, z,) , .
• Z"Zz -'( 1
- 0, z, ' - a, Zl - a,z.-')( 1 - a, Zl - 0, Zl-- 'O,Z2
)'
Clearly e(m, n) is a moving average in the m (noncausal) dimension and is white in the
n (causal) dimension.
Noneausal model.
2
S. (z"zz) = 13 [1 - al zl' - a2Z' - a,z,' - a4z2) = ~z A (z" Z2) •
Example 6.8
Consider the linear system
A (z" Z2) U (z" Z2) = Ftz«, Z2)
where A (z" Z2) is a causal, semicausal, or noncausal prediction filter discussed in
Example 6.7 and F(Z"Z2) represents the Z-transform of anarbitrary input [(m, n).
For the causal model, we can write the difference equation
u(m, n) - al u(m -I,n):' a2u(m, n -1) - a,u(m - I,n - 1) = [(m, n)
This equation can be solved recursively for all m ~ 0, n ~ 0, for example, if the initial
values u(m, 0), u(O,n) and the inputf(m, n) are given (see Fig. 6.11a). To obtain the
n -1 n •
It-I "
•
• m -1
• • • m ,
c
m-l • m+ 1
~'
m~ • D ....-<O·) ,
• • m=N+1
",
m m
lal Initial conditions for the _sal system. Cbl Initial and boundary conditions for the semicausal system. ,
,
n- 1 " ;n e N +1
m -1
m , , •
!
m+ 1
m=N+1
,,
, ,
m
Now the solution at (m,n) (sec Fig.6.l1b) needs the solution at a future location
'(m + 1, n). However, a unique solution for a column vector u, £; (u(l,n), ... ,
"(N, ll)jT can be obtained recursively for all n ~O if the initial values utm, 0) and the
boundary values u(O,jl) and u(N + L») are known and if the coefficients aJ,a2,a3
satisfy certain stability conditions discussed shortly. Such problems are called initial-
boundary-value problems and 'can be solved semirecursively, that is, recursively in one
dimension and nonrecursively in the other.
For the noncausal system, we have
u(m,n) -al(m -I,n) -atu(m + l,n)-a3(m,n -1) -a.u(m,n + 1) =[(m,n)
• •
This can seemingly be solved recursively for all m 2: 0, n :> 0 as an initial-value problem
if the initial conditions s On, O).u(O,n),u(-I,n) are known. Similarly, we could reo
index the noncausal system equation and write it as an initial-value problem. However,
this procedure is nonproductive because semicausal and noncausal systems can become
unstable if treated as causal systems and may not yield unique solutions as initial value
problems. The stability of these models is considered next
Noncausal systems
p q
A(zl;z~.)=l- 2: 2: a(m,n)zi'mzi' (6.79)
In'=-pn=-q
(n:. n)* to.O)
• •
•
212 Image Representation by Stochastic Models I:;hap.6
These are stable as nonrecursive filters
,
if and only if
Semicausal systems
p p q
Causal systems
p p q
A(Z1> Z2) = 1- 2: a(m,O)zlm- 2: 2:a(m,n) zlm zi"
m=l m=-pr.=l
, ' Now we consider the problem of realizing the foregoing three types of representa-
tions given the spectral density function or, equivalently, the covariance function of
the image. Let H (Zl. Z2) represent the transfer function of a two-dimensional stable
linear system. The SDF of the output u(m, n) when forced by a stationary random
field E(m, n) is given by
1
• .. Su(ZhZ2) = H(Zh z2)H(z'!\zi )S. (ZhZ2) . (6.82)
The problem of finding a stable system H(ZI,Z2) given 8u(ZhZ2) and S.(ZhZ2) is
called the two-dimensional spectralfactorizationproblem. In general, it is not possi-
Separable Models
If the given SDF is separable, that is, S(Zl> Z2) = S. (z,) S2(Z2) or, equivalently,
r(k, I) = r, (k)r2 (I) and 5, (Zl) and 52 (Z2) have the one-dimensional realizations
[HI (Z\), 5" (Zl)] and [H2(Z2), S'2 (Z2)] [see (6.1)J, then S(ZI> Z2) has the realization
H(zt.z2) ,d, H I (zl)H2(Z2) .
S, (Zh Z2) = S" (ZI)S., (Z2) . (6;85)
Example 6,10 (Realizations of the separable cavariance function)
The separable covariance function of (2.84) can be factored as r, (k) = 0- 2 pl~l, r2(1) =
pl~l. Now the covariance function r(k) ~ 0- 2 plkl has (1) a causal first-order AR realiza-
tion (see Example 6.1) A(z) ~ 1 -' pz -I, S. (z) = 0"(1- p2) and (2) a noncausal MVR
(see Section 6.5) with A(z)!b-a(z +z·'), Sv(z)=a2~'A(z), ~'~(l-p2)1
(1 + p'), n ~ pI(l + p2). Applying these results to r, (k) and r, (1), we can obtain the
following three different realizations; where o., 13" i '" 1,2, are defined by replacing p
by p" i = 1,2 in the previous ddi!'.itions
.
of
.
a and 13.. ,
,' .
Causal MVR IC1 model), Both r. (k) and r, (I) have causal realizations. This
•
gIves •
A (zt, z,) = (1- P, zi 1)(1 - Pzzi '),
•
2
S. (z" z,) = 0- (1- P1)(1 - p1)
u(m, n) = Plu(m -I,n) + p2u(m, n -1) (6.86)
Semicausal MVR (SC1 model), . Here', (k) has nencausal realization, and
. ,,(1) has causal realization. This yields the semicausal prediction-error filter
•
A(z" 4') "" [1- \X, (z , + z,I)](1- 112 Zit), .
S. (Zl, Z2) "" fl 21s. (1 - ~)[1 - \X, (z, + zll)]
u(m, n) "" Ot,[u(m -1,n) + u(m + l,n)] (6.87)
. +p,u(m,n -1)-p2(lI[u(m -1,n -1)
+u(m+1,ri-l)j+6(m,n) .
Noncausal MVR (NCt modell, Now both Tl (k) and '2(/) have noncausal
realizations, giving
A (Zl' Z2) "" [1- al (Zl + zj')][1 - U2 (Z2 + zi')],
S. (Z"Z2) "" 0" f3'i ~A (z" Z,)
•
u(rn, n) = ot,[u(m -1, n) + u(m + 1,It)J
(6.88)
" + Q.[u(m, n - 1) + u(m, It + I)J
- Qc,Qc,{u(m -i,1t -1) •
+u(m + 1,1t -1)+u(m -1,1t +1)
+ u(m + l,n + 1)] + 6(m, n)
.
This example shows that all the three canonical forms of minimum variance representa- .
,
tions are realizable for separable rational SDFs. For nonseparable and/or irrational
". SDPs, only approximate realizations can be achieved, as explained next. . .
For the given SDF S(ZI> Z2), suppose lIS (Zl, Z2) has the Fourier series
1 '" '"
• S
(
Zj,Z2)
== L L
"'=-"n=-'"
r+(m,n)zl"'zi',' zl";'e iw1, z2==e i"'" (6.90)
, I
1 - r + (m n)
1!,2:= r" (0, 0) , a (m, n) = r+ (0, ~) (6.91)
1 1
S(zr,Zz) = [ ( -I ' .')]-' 0<0:<4 (6.92)
. 1 - c< Zl + ZI + Z2 + Z2
Clearly, s:has a finite Fourier series that gives 13' = 1, a(m, n) "" a for (m, n) =
(:t1,0). (0, ± 1) and a(m, n) "" 0, otherwise. Hence the noncausal MVR is
u(m, n) = o:[u(m + 1, n) + u(m -1, n) + u(m, n + 1) + u(m, n -1)] + eim, n)
, (1 (k, I) = (0, 0)
r, (k, I) "" 1':(1, (k, l) "" (±I, 0), (0, :t1)
lo, otherwise
Realization of Causal and Semicausal MVRs
Inverse Fourier transforming S(el"',e iw,) with respect to (1)1 gives a covariance se-
quence r,(e iw,), 1 = integers, which is parametric in (1)2' Hence for each W2; we can
find an AR model realization of order q, for instance, via the Levinson recursion
(Section 6.4), which will match rr(e i"'2) for -:-q :s; l < q.
Let 13 (e iW2),an (e i" ,) , 1 S n S q be the AR model parameters where the predic-
2
tion error (32 (e jWZ ) > 0 for every W2. Now (32 (e i"',) is a one-dimensional SDF and can
be factored by one-dimensional spectral factorization techniques. It has been shown
that the causal and semicausal MVRs can be realized when 13-2 (e l"',) is factored by
causal (AR) and noncausal MVRs, respectively. To obtain finite-order models
an (e l"'2) and /32 (e l"',) are replaced by suitable rational approximations of order P, for
instance. Stability of the models is ascertained by requiring the reflection coeffi-
cients associated with the-rational approximations of an (e i" ,) , n '" 1,2, ... ,q to be
less than unity in magnitude. The SDFs realized by these finite-order MVRs can be
made arbitrarily close to. the given SDF by increasing p and q sufficiently. Details
may be found in [16}, ' .
,
r(k, l) = 2:2:. (lp,q tm, n)r(k - m, 1 - n) + I?>~,q '3(k, I); (k, l) El¥. (6.93)
(m,n) € W. '
where the dependence of the model parameters on the window size is explicitly A
shown. These equations are such that the solution for prediction coefficients on VII,
requires covariances rtk, /) from a larger window. Consequently, unlike the I-D
AR models, the covariances generated by the model need not match the given
covariances used originally to solve for the model coefficients, and there can be
many different sets of covariances which yield the same MVR predictors. Also,
stability of the model is not guaranteed for a chosen order..
In spite of the said shortcomings, the advantages of the foregoing method
are (1) only a finite number of linear equations need to be solved, and (2) by solving
these equations for increasing orders (p, q), it is possible to obtain eventually a
finite-order stable model whose covariances match the given rik, I) to a desired
accuracy. Moreover, there is a 2-D Toeplitz structure in (6.93) that can be exploited
to compute recursively ap , q (m, n) from ap-l,q (m, n) or ap , q -1 (m, n), and so on, This
yields an efficient computational procedure, which. is. similar to the Levinson-
Durbin algorithm discussed in Section 6.4.
If the given covariances do indeed come from a finite-order MVR, then the
solution of (6,93) would automatically yield that finite-order MVR. Finally, given
the solution of (6.93), we can determine the model SDF via (6.77), This feature
gives an attractive algorithm for estimating the SDF of a 2-D random field from a
limited number of covariances,
Example 6.12
Consider the isotropic covariance function r(k, l') = O.9vkj~+I2. Figure 6.12 shows the
impulse responses {-a (m, n)} of the prediction error filters for causal, semicausal, and
n n
-0.233 -0.3887 -0.0135 -0.0006 • -0,2518 . -0.0006
Causal Sernicausal
,
Figure 6.12 Prediction error filler impulse responses of different MVRs. The
. , origin (0, 0) is at the locanon of the boxed elements.
•
•
•
noncausal MVRs of different orders. TIle difference between the given covariances
and those generated by the•
models has been found to decrease when the model order is
•
increased [16). Figure 6.13 shows the spectral density match obtained by the sernicausal
MVR of order (2.2) is quite good.
(alOFT spectrum; fbI causal MVR spectrum; (c) 68micausal MVR spectrum.
with z, '" expfjo»), Zz '" exp(jwz). Results given in Fig. 6.14 show both the causal and
the semicausal MVRs resolve the two frequencies quite well. This approach has also
been employed successfully for spectral estimation when observations of the random
field rather than its covariances are available. Details of algorithms for identifying
model parameters from such data are given in [17].
10gS(zbZz) = . 2:
m .. n
2:
-00 = -::.t
c(rn, n)zi'''' zin, (6.94)
Then
S= S. ~S H'H- (6.97)
•
A+(ZhZZ)A-(zj,zz)- e
•
is a product of three factors, where
•
If the decomposition is such that A + (Z1> Z2) equals A - (Z1 1, Z2 1) and S, (zj, Z2) is an
SDF,then there exists a stable, two-dimensional linear system with transfer func-
tion H+ (ZI> Z2) = ItA + (Zh Z2) such that the SDF of the output is S if the SDP of the
input is S,. The causal or the semicausal models can be realized by decomposing the
cepstrum so that C+ has causal or semicausal regions of support, respectively. The
specific decompositions for causal MVRs, semicausal WNDRs, and semicausal
MVRs are given next. Figure 6.15ashows the algorithms.
-1 ~ -1 .
(6.99c)
Using (6.95) it can be shown that. C+ (Zl, Z2) is analytic in the region {izll = 1,
Izd 2: 1} U {IZII2: 1, Z2 = oo}. Since eX'is a monotonic function, A + and H+ are also
analytic in the same region and, therefore, have no singularities in that region. The
region of support of their impulse responses «: (m, n) and h + (m, n) will be the same
as that of c+ (m, n), that is, 51' Hence A + and H+ will be causal filters .
•
Semicausal WNDRs
C+ (Zh Z2):= ~
In
2:
0:: -till'
m
c(m, O)Zl + 2:
m=
2:1
ctm, n)Zlm Z2"
-«I It !lit
(6.100a)
. ~ ~ -1
II : •
r fm. n)
ICIJ
I
I
w(m, n)
•
a+ 1m, nl
-1
•
Causal MVR Semicausal WNDR Semicausal l\llVR
w(m, nl =i
'f0'l
,
1m,h n)eS,
.
at 8rwl5e
A
wlm,
"
n ) fa
ee
l - j.,ln),
,
h
ot e(\VI5e
.
1m, nleS.
-wIm
,
nl -
- (1, 1m, nleS.
~
lO, otherwise
w,(m, nl = .,(m)6fn) w,lm, n) = 0 w,(m,n! = -.,(n)
•
fal Wiener-Doob homomorphic transform method. '1ft is the two dimensional homomorphic transform;
• • • •
• • • •
• • • • •
-0.0044 0.0023 --0.0016 0.0010
• •
-0.9965 0.0034 -0.0024 0.0015 •
• •
-0.0101 0.0052 • -0,0039 0.0026
• 0.0004
•
-0.0169 0.0083 0.0008 -0.0063 0.0056 0.0006 •
• •
-0.0315 0.0144 0.0011 • 0.0019 0.0259 0.0052 0.0004
•
-0.0712 0.0271 0.0017 0.0004 -0.3866 -0.0385 0.0168 0.0010
I
-0.2'142 0.0426 0:0025 0.0005 rn 11.000 I -0.2753 0.0282 0.0014 •••
rt n 11.0000
-0.4759
-0.3517
-0.0299
0,0460
0.0238
0.0025
0.0016
0.0005'., ••
0.0005 m
-0.3866
0.0019
-0.0385
0.0259
0.0168
0.0052
0.0010
0.0004
m -0.0029 0.0363 0.0068 0.0007 0.0004 -0.0063
•
'0.0056 0.0006 •
-0.0111 "
0.0091 • •
0.0010 0.0004 • , :"0.0039 0.0026 0.0004
• •
-0.0070 0.0045 0.0007 0.0003 -0.0024 0.0015 •
-0.0045 0.0028 0.0004 • •
• --0.0016 0.0010
•
--0.0030 0.0019 0.0003 •
•
•
•
•
" .•
•
-0.0021 0.0013 0.0002 _' j""'" ,
, ., •
·
•
•
'
•
•
•
' '
•
•
• • , •
Semlcausal MVR, {32 = 0.09324
'"
S, (Zh Zz) == - 2: c(m, O)ZI'" •
(6.101c)
The region of analyticity of C+ and A + is {lzli = l,lzzl ~ I}, and that of S, and S, is
{lzll == l,\fzz}. Also, a + (m, ll) and h + (m, n) will be semicausal, and S, is a valid SDF.
I '
Consider the SDF of (6.83). Assuming 0<<X<,;;1, we obtain the Fourier series
logS = <X(ZI + %1 1 + Zz + zi") + 0(a2 ) . Ignoring O(a') terms and using the preceding
results, we get
CausalMVR
l), ,
C+=a(zl'+zi S.(ZI,Z,)=O
~ A + (Zt. %2) = exp( ":C''') "" 1- a(z,1 + zi'I), S. (XI,X2) = 1
Semicausal MVR
•
C+ = a(2:t + XI I)+ l
azi , S. =: -a{xl + Zt l
)
~ A + (Zl, z,) =< 1 - a(zl + ZII) - ,azi!, S. (Z1o z,) "" 1 -, cx(Zt + zll)
,
, Semicausal WNDR
•
•
et(%, + ZI-')+'
C,+ =:2 ·-1
' OI.Z2, 8.=0
Example 6.15
Figure 6.15b shows the results of applying the Wiener-Dcob factorization algorithm to,
the isotropic covariance function of Example 6.12. These models theoretically
,
achieve
perfect covariance match. Comparing these with Fig. 6.12 we see that the causal and
semicausal MVRs of orders (2,2) are quite close to their infinite-order counterparts in
Figure 6.15b. .
•
6.9 IMAGE DECOMPOSITION, FAST KL TRANSFORMS,
AND STOCHASTIC DECOUPLlNG
A convenient representation for images is obtained when the image model is forced
to lie on a doubly periodic grid, that is, a torroid (like a doughnut). In that case, the
sequences u(m, n) and etm, n) are doubly periodic, that is,
eim, n) = E(m + M, n + N)
•
• Example 6.16
Suppose the causal MVR of Example 6.7. written as •
where we have used the fact that u(m, n) is periodic. This can be written as
that is, v(k, I) is an uncorrelated sequence. This means the unitary DFT is the KL
transform of u (m, n).
• •
Noncausal
. . Models and Fast KL
.
Transforms
Example 6.17
•
The noncausal (NCZ) model defined by the equation (also see Example 6.7)
u(m;n) - a[u(m -l,n) + u(m + l,n) I
(6.103a)
· -r uim,» -l)+u(m,n +l)]=e(m,n)
becomes an ARMA representation when e(m, n) is a moving average with covariance
1 (k,/) = (0,0) .
ai'
r, (k, l) = f3 -0:" (k,/)=(±l,O) or (O,::t1) (6.1i)3b)
0, otherwise.
For an 111 x Nimage U, (6.103a) can be written as
•
QU + UQ = E + B 1 + B2
. ., '
, (6.104)
•••
",b~
where b., 1>;" b o, and b. are 111 x 1 vectors containing the boundary elements of the
image (Fig. 6.16), Q is a symmetric, tridiagonal, Toeplirz matrix with values j along the
· ll'rnl\lll U
•
.
b[
I
---- I +
•
•
U· V<' •••
IIf2 uncornllated
b:li
I
U I b.
1
,
•
•
-
+ >J;U·q,T • •••
" .. ~ random "ariabl.
----
bf . •
•
U·
Boundary
values. a •
.>(-1
•
• •
Figure 6.16 Realization of noncausal model decomposition and the concept of fast
KL transform algorithms. 'i' is tile fast sine transform.
•
,
main diagonal and -a along the two subdiagonals, This random field has the decom-
position
u = lJ'l+ Vb •
where Vb is determined from the boundary values and the KL transform of If' is the
(fast) sine transform. Specifically,
•
-
-. •
•
+ u·
• v: •
N uncorrelated
+ '!' AR processes
- yO• •
•
. •
•
\ ,
,
, • Sernlcausa!
model -
Boundary values
,
b.
__ u._0 , . •
Initial values •
Clearly, u~ is a vector sequence generated by the boundary values b. and the initial
vector uo. Multiplying both sides of (6.110) by 'If, the sine transform, and remem- .
bering that .r. = I,·.Q.T := A = Diag{l\(k)}, we obtain
.Q.T.U~=i'.U~_1 +.£., .u8= 0 ::}Av~='YV~ _I + e. (6.111)
where v~ and e. are the sine transforms of u~ and e., respectively. This reduces to a
set of equations deeoupled in k, .
Since En is a stationary white 'noise vector sequence, its transform coefficients en (k)
are uncorrelated in the k-dimension. Therefore, v~(k) are also uncorrelated in the
k-dimension and (6.112) is a set of uncorrelated AR sequences. The semicausal
model of,(6.87) also yields this type of decomposition [21]. Figure 6.17 shows the
realization of sernicausal model decompositions. Disregarding the boundary
effects, (6.112) suggests that the rows of aeolumn-by-column transformed image
using a suitable unitary transform may be represented by AR models, as mentioned
in Section 6.2. This is indeed a useful image representation, and techniques based
on such models lead to what are called hybrid algorithms [1, 23, 241, that is,
algorithms that are recursive in one dimension and unitary transform based in the
other. Applications of semicausal models have been found in image coding, restora-
. tion, edge extraction, and high-resolution spectral estimation in two dimensions.
6.10 SUMMARY
•
In this chapter we have considered several stochastic models for images. The one-
dimensional AR and state variable models are useful for line-by-line processing of
images. Such models will be found useful in filtering and restoration problems. The
causal, semicausal, and noncausal models were introduced as different types of
realizations of 2-D spectral density functions. One major difficulty in the identifica-
tion of 2-D models arises due to the fact that a two-dimensional polynomial may not
be reducible to a product of a finite number of lower-order polynomials. Therefore,
PROBLEMS
6.1 (AR model properties) To prove (6.4) show that il (n) satisfies the orthogonality
condition E[(u(n) - u (n))u{m)) = 0 for every m < n. Using this and (6.3a), prove (6.7).
6.2 An image is scanned line by line. The mean value of a pixel in a scan line is 70. The
autocorrelations for n = 0,1.2 are. 6500, 6468, and 6420, respectively.
a. Find the first- and second-order AR models for a scan line. Which is a better monel?
Why?
b. Find the first five covariances and the autocorrelations generated by the second-
order AR model and verify that the given autocorrelations are matched by the
model.
6.3 Show that the covariance matrix of a row-ordered vector obtained from an N x N array
, of a stationary random field is not fully Toeplitz. Hence, a row scanned two-dimensional
stationary random field does not yield a one-dimensional stationary random sequence.
6.4 One easy.method of solving (6.28) when we are given S(w) is to let H(oo) == v'S(w).
For K ~ 1 and S(w) == (1 + p2) - 2p cos u, show that this algorithm will not yield a
finite-order ARMA realization. Can this filter be causal or stable? •
. 6.5 What are the necessary and sufficient conditions that an ARMA system be minimum
;>bse? Find if the following filters are (i) causal and stable, (ii) minimum phase. •
a. H(z) == 1 - O.8z~'
b. H(z) == (1 - z -1)/[1.81 - O.9(z + z -')]
c. H(z)=(I-O.2z-')/(1-0.9z~')
6.6 Following the Wiener-Doob decomposition method outlined in Section 6.8, show that a
I-D SDF S(z) can be factored to give H(z) ~exp[C+ (z)], C'" (z) ~ :2:=-1
c(n)z-', K =
. exp[c(O)], where {c(n),Vn} are the Fourier series coefficients of log5(oo). Show that
H(z) is a minimum-phase filter.
6.7 Using the identity for a Hermitian Toeplitz matrix
I • •
R~ ,
I
6,,*
Rn + 1 = ,
•
•
b~ , reO)
where bn ~ [r(n), r(n - 1), ... ,r(lW, prove the Levinson recursions.
6.8 .. Assume that R is real and positive definite. Then a. (k) and P. will also be real. Using·
I
the Levinson recursions, prove the following. . •
• •
Solve this via the Fourier transform to arrive at (6.38) and show that S(z) "" (>2/A (z).
Apply this to the relation S(z) = Sv(z)IA(z)A(z·') to obtain (6.39).
·6.11 Show that the parameters of the noncausal MVR of (6.43) for pth-order AR sequences
defined by (6.3a) are given by
'_'~'a(n)o(n +k).
",-(k) - £., C2 '
n=O •
, ""
such that Ub (x), which is determined by the boundary values Uo and U, • is orthogonal to
uD(x). Moreover, the KL expansion of UO (x) is given by the harmonic sinusoids
•
•
k=I,3;5 ...• -L$x$L
•
k = 2,4,6 ...
(causal)
A (z, Z2) = 1 - a, (z. + zl') - a2zi ' - a, Z2 "(z, + zj ') (scrnicausal)
A (z, , Z2) = 1 - a, (z, + Zl') - a2 (z, + zil)
.; a3z2 I (ZI + zl')- a.zl' (zz + zi') (noncausal)
•
a. Assuming the prediction error has zero mean and variance i?>2, find the SDF of the
prediction error sequences if these are to be minimum variance prediction-error
filters.
b. What initial or boundary' values are needed to solve the filter equations
A (ZI ,ZZ)U(ZI ,Z2) = Fiz, ,Z2) over an N x N grid in each case?
6.17 The stability condition (6.78) is equivalent to the requirement that IH(z; .z2)1 <00,
Iz 11 = 1, IZ21 = 1. .
a. Show that this immediately gives the general stability condition of (Q.79) for any
two-dimensional system, in particular for the noncausal systems.
b. For a semicausal system H(m, n) = 0 for n < 0, for every m. Show that this re-
striction yields the stability conditions of (6.80).
c. For a causal system, we need hem, n) = 0 for n < O,V'm and hem, n) = 0 for
n = 0, m < O. Show that this restriction yields the stability conditions of (6.81).
•
6.18 Assuming the prediction-error filters of Example 6.7 represent MVRs, find conditions
on the predictor coefficients so that the associated MVRs are stable,.
. 6.19 If a transfer function Htz, , zz) = H, (z,)H z (Z2), then show that the system is stable and
(a) causal, (b) sernicausal or (c) noncausal provided H, (z,) and H 2 (z, ) are transfer
functions of one-dimensional stable systems that are (a) both causal, (b) one causal and
one noncausal, or (c) both noncausal, respectively.
,
6.20 Assuming the cepstrum c(m, n) is absolutely surnmable, prove the stability conditions
for the causal and sernicausal models.
6.21 a•. Show that the KL transform of any periodic random field that is also stationary is the
two-dimensional (unitary) DFT. . '
\ b. Suppose the finite-order. causal, semicausal, and noncausal MVRs given by (6.70)
are defined on a periodic grid with period (M, N). Show that the SDF of these
random fields is given by
Section 6.1
For a survey of mathematical models and their relevance in image processing and
related bibliography:
Sections 6.2-6.4
Further discussions on spectral factorization and state variable models, ARMA
models, and so on, are available in: .
Section 6.5
Noncausal representations and fast KL transforms for discrete random processes
are discussed in [1, 21] and: .
•
13. A. K. Jain. "A Fast Karhunen Loeve Transform for a Class of Random Processes."
IEEE T'rans. Comm. COM-24 (September 1976): 1023-1029. Also see IEEE Trans.
Comput. C-25 (November 1977): 1065-1071...
-
230 . Image Representation by Stochastic Models. Chap. 6
•
•
Here we follow [I], The linear prediction models discussed here can also be gen-
eralized to nonstationary random fields [1]. FGf more on random fields:
•
14. P. Whittle. "On Stationary Processes in the Plane." Biometrika 41 (1954): 434-449.
15. T. L. Marzetta. "A Linear Prediction Approach to Two-Dimensional Spectral Factoriza-
tion and Spectral Estimation." Ph.D. Thesis, Department Electrical Engineering and
Computer Science, MIT, February 1978. .
16. S. Ranganath and A. K. Jain. "Two-Dimensional Linear Prediction Models Part I:
Spectral Factorization and Realization." IEEE Trans. ASSPASSP-33, no. 1 (February
1985): 280-299. Also see S. Ranganath. "Two-Dimensional Spectral Factorization,
Spectral Estimation and Applications in Image Processing." Ph.D. Dissertation,
Department Electrical and Computer Engineering, UC Davis. March 1983..
17. A. K. Jain and S. Ranganath. "Two-Dimensional Linear Prediction Models and Spectral
Estimation," Ch. 7 in Advances in Computer Vision and Image Processing. (T. S. Huang,
ed.). Vol. 2, Greeenwich, Conn.: JAI Press Inc., 1986, pp. 333-372.
18. R. Chellappa. "Two-Dimensional Discrete Gaussian Markov Random Field Models for
• Image Processing," in Progress in Pattern Recognition, L. Kanal and A. Rosenfeld (eds).
Vol. 2, New
•
York, N.Y.: North Holland,198~. pp. 79-112.
19. J. W; Woods. "Two-Dimensional Discrete Markov Fields." IEEE Trans. Inform. Theory
IT-IS (March 1972): 232-240.
•
Section 6.8
Here we follow [1] and have applied the method-of [6J and:
•
• •
•
•
. .
Image Enhancement
7.1 INTRODUCTION
-
•
•
•
.
,
Point operations Spatial operations Transform operations Pseudocoloring
,
- Contrast stretching
It * Noise smoothing • Linear filtering • False coloring
• Noise clipping .' Medlan filtering • Hoot filtering , .. Pseudocoloring
*' W~ndow slicing • Unsharp masking .. Homomorphic fHtering
• Hinogrammodeling ;r'- ....,,-* 'Lew-pass. bandpass• •
•
high. pass filtering
.• Zooming •
Figure 7.1 Image enhancement.
233
•
! •
TABLE 7.1 Zero-me~£!Y~r~JQrJm~W'l~~.Qpem~!lt. Input and output gray level$ are distributed
"EetweenTO,l]. TypIcally, L = 255 .
1.. Contrast stretching- au,. Osu<a The slopes a, ~, 'Y determine the relative
f(u)=ijj(u -a)+v., a su <b contrast stretch. See Fig. 7.2.
'Y(u -b)+vb , b is u-c L
2. Noise clipping and 0, OSu<a Useful for binary or other images that
thresholding f(u) = {au, aSusb have bitqo<I&ill.stxjJ;>.u.timLof gray--··
L, u :i:! b levels. The a and b define the valley
between the peaks of the histogram.
For a = b = t, this is called
thresholding.
•
3. Gray scale reversal •
f(u)=L -u Creates digital negative of the image.
Contrast Stretching",
A special case of contrast stretching where (X = 'Y = 0 (Fig. 7.4) is called ~liQPing. ,,
This is useful for noise reduction when the input signal is known to lie in the range ..
[a, b]. .
Thresholding is a special case of clipping where a '" b ~ t and the output
becomes binary (Fig. 7.5). For example, a seemingly binary image, such as a printed'
page, does not give binary output when scanned because of sensor noise and back-
ground illumination variations. Thresholding is used to make such an image binary.
Figure 7.6 shows examples of clipping and thresholding on images.
•
v
,r/'
':ji;.-
I -----~------- , •
I
I
v" I
I
I
t,
I •
I Figure 7.2 Contrast stretching transfer-
V. I marion. For dark region stretch a> 1,
• ,
" I '
u
a exLI3; .midregion stretch, Ii> 1,
0 b L b ex ~ L; bright region stretch 'Y > 1.
,
,, • .' • • •
,
•
• ," .-
-, . •
, , ., ,,~
, ~ . • ....
'.
.- ""
• ~.# ii"
H.
~f'"
". ,
,- • ••
'.
•
-
•• t,
• ;-.,"
•
"
•
,
• " ;"'
•
•
mmmm
t;.<l! • • • ~.t¥«r'''' ~
I"
cun'! ~ t'I ;"f .. ~ ;r tf .... $I
is # ;;; ' Ii ,3 R : x' ;.Ill ! ~ 1t
"r;tw It ~ ~ 'ff '11l!;~ ,>';r ....
",ItIJ/,OtJlI:t';'!ff"fltl!::,:. Ill' };:' ~ :I' '" tl ~,' !t' >' Z "'"7 !J! ,
*'!":;!Jl4~<'4,","."ll:4'101 :7 :: '7 3 " '7 "f' '!t ':'" .,. V '1'
"'l:"iIi<!IJ,,(J'lijl!K'~r .. 11 if ,,: '~; -; '.' \t ,7 '!1 'r' 1"
, lI:' ;:, ') "'" ~ '" "" IT % .;;) 7"
if iti ,. II 'II ,. a It .: '4 " II!' l';'
.W';i..-r \!o n""·'·''''"''# .... .I;' 11 ::r Y 'l! " / 'Ii' }';; '1'
y'J;
t' % t: ';t ;f ".:- tr' '$ (; 17 '" F ~
.. "i/lr;'I)'$'!I''!It:?lf)'S -f •
• ,." .. ,," "'e i oN 1$ l.1 P i& 4 ::: ;lt 11 ;.} "!t \!<
"."'1
'If.<t'!t UI"I!l~. , t1&Z'(1,!7::'==~"':1rr ~
,
.~
IIII '.!' .'<l21lt ':' e filii
.... fj"V !J /.-
a."1i"6t'J"U!t'''$: ,...., ;,.
..
,.. ,
,•
'1I'rr'flt/l1t1l,lJ>..-at;l!1;J
1(f1\'!311~lIi$.,.~~;;1!
"
- ,"
I
f/I"..-:::Jfflf>;;.'<''''l!>'!l
.,1II:f'f",li:I lll;:;N'f!Il!l,,"
1I''I!l1!<olW'lt'lo1l!1l'!:'1ll'>f\l'
Ii,I ... •
c'
, .'
••• .at"1FS-vi»<'!I':$tt~tt. ~,
!!
" .. " , WI ... 10f 11' '" .. ;::<
_.~
,""
"-..L< ~.,
""
~*t31"""
II « ',"
,•
•
,
. • 'I
. .. ~ ,
'.
'
~
'L
.
". ,""
••
_,.--,,..~,,,,,.._., ,~"",~_",,,~_ ,7
j•.",'-
>:i
l,
,
'j • • • ......., ~ ~.
-- • , ,.'
r•
" I
• •, •
,
•
• •
I \
"
zao
•
•
Image
histogram
v r
\
\ I I
I \:/
\ II
\
,'/
"I --- I ""-
U II U
b
v v
• --,--r-.......--.- u
•
H •, •
-. s'."'.
,
•
• •
•
•
Figure 7.6 Clipping and thresholding.
237
v
•
Digital Negative
A negative image can be obtained by reverse scaling of the gray levels according to
the transformation (Fig. 7.7) .
~ _._-_.- " -----',
Iv =L - uI f
(7.3)
<
~
---"'
.
Figure 7.8 shows the digital negatives of different images. Digital negatives are
useful in the dis£lay'of medical images and in producing negative prints of images.
•
, .,. !
..." -.'.
,.
"'
'{
-
.. , -
.
' ,,-, " '
~.
(a) (b)
Figure 7.8 Digital negatives.
v t I
I
I
I I
I I
L f------- I I
I I
I
I I I
I I I
I I I
I I I
45° I i I " ,
, U u
a b 8 b L
, , ..
Ie) Without background (bl With background
Bit Extraction
I I
, "
" ,
. '. .
. "
•
,,t
t.
iI
where
.t n =
A I nt [ 2 u-
B n , Int(x J ~ integer part of x (7.9)
Figure 7.11 shows the first 8 most-significant bit images.of an 8·bit image. T1}i§
JTllusformatioD is useful in ~termining..1:he number, QLY.isually-significaut bitStIl,
. ~n image. In Fig. 7.11 only the first 6 hits are visually significant, because the
remaining bits do not convey any information about the image structure.
Range Compression;~
--
."'~-'-
\ \
,
"
,f;:
• ,
•
. •
•
(,
ii, • • '. ,,- :
'"
f " -,,'
"
t-,l, ,:
... - .~.~. '¥
~..
.. ~.
;;: - •
- ";
•
'-",
•
-
.;
,,, .
-""'-:.".-
'
,- -
•
s
,
• • • • • ','
.-,"','-;'
. "."
• 'N
, . ,
,-
, '-
:'.'
..,-,
.
.• "'.,. . .
. , c,-
·· -.' . . H. ..-" _
,"'. -',
.",
" '-;
;'~. . ".,
:. -;
- - .'
. .. :,'-, .s.,
,
.. .
". ' ,
, .
" ,',.
'? -
Ii
' /'
,
•
. ". . .-
";;,' ;.,0',.'-
• " . - ','
- -
A
• •
-'.'-'
,-,
.- . " "
.-.,' ,
,"
.
•..
"
- .; -. .- - -
'
" '_"._~V'_-,-
-',
~
. -,.-.\:"'.;-. '. ~
.. - ... , '.:
"
• • •
in a body. The blood stream is injected with a radio-opaque dye and X-ray images
are taken before and after the injection. The difference of the two images yields a
clear display of the blood-flow paths (see Fig. 9.44). 9ther applications of change
detection
......-..
are in security monitoring systems, automated inspection of . ~
,--- -- -,- ,,-_.~ ,--_.
histogram of the image that gives h (XI)' the number of pixels with gray level value Xi.
Then .
i = 0, 1, ... , L - 1 (7.12)
i=O
•
The output y' , also assumed to have L levels, is given as follows: t
(7.13b)
where V""n is the smallest positive value of v obtained from (7.13a). Now v will be,
uniformly distributed only approximately because v is not a uniformly distributed '
variable (Problem 7.3). Figure 7·.13 shows the histogram-equalization algorithm for
digital images. From (7.13a) note that Y IS a discrete variable that takes the value
k
Vk = 2:
- i"'O
p; (Xi) (7.14)
if U = Xk. Equation (7.13b) simply uniformlY regi!antizes the set'{ Yk} to {vk}; Note
that this requantization step is necessary because the probabilities P. (Xk) and p, (Yk)
are the same. Figure 7.14 shows some results of histogram equalization.
Histogram Modification
A generalization of the procedure of Fig. 7.13 is given in Fig. 7.15. The input gray
level U is first transformed nonlinearly by f(u), and the output is uniformly quan-
tized. In histogram equalization, the function
•
• • ,
»,
- .
Other choices of f( u) that '
have similar behavior are '
t We assume .1:0= O.
,
•
,, •.
. j .
,
'. "' '
I "
n
-, '
. I
'.
~
, £
. ---
"
"
~-- •
•
~-
,
fa) Top row: input image_ its histogram; bottom row: fb) left columns: input images; right columns: proc-
processed image, Its histogram; essedimages,
•
•
Histogram Specification
Suppose the random variable u ;:;:0 with probability density Pu{tt) is to be trans-
formed to v 2: 0 such that it has a specified prot ability density pv (0'). For this to be
true, we define a uniform random variable
•
u V Uniform v·
(I'll quentization
• • 'Flgure 7.1: Histogram modification.
•
If u and v are given as discrete random variables that take values Xi and Yi,
i = 0, ., . , L - 1, with probabilities pu (Xi) and P» (Yi), respectively, then (7.21) can
be implemented approximately as follows. Define
u k
A - A
w:: 2: po (Xi), Wk = 2: p; (Yi), k = 0, ... ,L - 1 (7.22)
Xj=O > i=O
Let w' denote the value W. such that W. - IV 2:0 for the smallest value of n. Then
V' = Yn is the output corresponding to u. Figure 7.16 shows this algorithm.
•
Example 7.1
Given Xi "" Yi "" 0, 1,2,3, pu (Xi) = 0.25, i = 0, ... , 3, PV(YQ) = O,P,(YI) = pv(Y2) = 0.5,
p, (Y3) "" O. Find the transformation between u and v, The accompanying table shows
,, how this mapping is developed.
u Pu(Xi) w -
w. w· n v· ,
0 0.25 0.25 0.00 0.50 1 1
1 0.25 0.50 0.50 0.50 1 1
2 0.25 0.75 1.00 1.00 2 2 P« (Xi) = ?
3 0.25 1.00 1.00 1.00 2 2 •
•
'1.4 SPATIAL OPERATIONS
where y(m, n) and vern, n) are the input and output images, respectively, W is a
suitably chosen window, and a(k, I) are the filter weights. A common class of spatial
. averaging filters has all equal weights, giving ---, . .
I • I ,
o 1 -1 0 1 -1 0 1
•
k 0
,
1
4 -
1
4 k: -1
1
9
•,
9
1
9
k -1 0 -a
••
0
1, -
1
-
1
o -•
1 1 1 1 1 ,
4 4
•, 9 e 4
,
8
1 1 1
1 0 0
9 9 9 8
where 1l(m, n) is white noise with zero mean and variance (J'~. Then the spatial
average of (7.~4) yields
1 ' .
v(m,n) = V 2:2: uim -k,n -l)+'l](m,n)~.27)
J. W(k,l)4H' ~..;I
~ u,
"
where 'l] (m. n) is the spatial average of 'l](rn, n). JUs a simple matter to show_that
.TJ (m, n) has zero mean and variance"i:T~ = g;IN", .jhat is, the noise poweris reduced
by a factor equal to the number of pixelsin the window W. If the noiseless image
u(m, n) is constant over the window W, then spatial averaging results in an im-
provement in the output signal-to-noise ratio by a factor of N w • In practice the size
of the window Wis limited due to the fact that u (m, n) is not really constant, so that
spatial averaging introduces a distortion in the form of blurring. Figure 7.18 shows
examples of spatial averaging of an image containing Gaussian noise.
Directional Smoothing
and a direction.'~"
e* is found such that Iy(m, n) -
~,,,,,
v(m, n : e*)1 is minimum. Then
vern, n) = v(rn, n : eO) • (7.29)
gives the desired result. Figure 7.20 shows an example of this method.
•
, " . .
,
!• -'
, ;.. " .
,
- •
, •
'''.~
~ "-.
- -,?-
..,,,-
'F "
"
' " ,'"
. "¥,,-,,
~;
?If' ....
--:;,
•
.
'"
-{
~.
-",
• :ij'
""
Ib) noisy
,
la) Original
- •
,~ ,
,;Il
.4l
\
,- '"
"\
,', "
,,
\'
<,
j, . .
•
•
...:1 A
Ie) 3 x 3 fitter (dl7 x 7 filter
•
,
Figure 7.18 Spatial averaging filters for smoothing images containing Gaussian
•
norse,
.
Here the input pixel is replaced bythqlledUlILofthe pixels contained in a window
around the pixel, that is, .
v(m, n) = median{y(m - k, n -l), (k, I) E W} (7.30)
where W is a suitably chosen window. The algorithm for median filtering requires
arranging the pixel values in the window in increasing or decreasing"order and
g;picking the middle value. Generally the window size is chosen so that N; is odd; If
N.. is even, then the median is taken as the average of the two values in the middle.
• •
•
,/1
.v (0) ~ 2 (boundary value), v (1) = median {2, 3, 8} == 3
.,;,¥'i
.• , .
. '
'- !
.' I
, )/<,;;:'
.~ "
_.. -
"
•
" •I
. '
•
. ,• • ".'
•
•,"
,:;-~ -•
,- '
Figure 7.20
Since the median is the (N", + 1)/2 largest value (N", "" odd), its search requires
(N", - 1) + (N", - 2) + ... + (N", -1)/2 "" 3(N; -1)/8 comparisons. This number
equals 30 for 3 x 3 windows and 224 for 5 x 5 windows. Using a more efficient
..
""",
"1'" .. '?-_!' -
\i
,
•
,
~ ,
,T<' . .)~
,
•
. ..
"-~-'r:ve
;;;'1; .. c, . '4
,
I
, . ,
•
r'-~"c>" '''-~'\ ".""'Y.~~,/-'"7;7 .... IS' ',-5>!'n ii . _+.,.l'<i ,f·, Y'.)!"-,S,"',"",,,,kW l~'._-,_;_."ilf'\_»\ .h,' i Ik".,ii\i'----_ ",!?,n4;+:"h:'V(f_t';<~
- --<'{
'•'
"f
;:fi
t1 ,
~. ;-,
, •
•
•~-
•
•t· •
•
, . , 1
\
•"J. -
r'''''''- . •
7f
'" :' -~
•
•
•
,j
,
.";
,•
-,0' ,j
f';
'v_ ..
- •
•
j
•
:. ..' ..
•
•
•
..• I
,. , . - , ,
•
~'-'.'- - •
,
"
!
•
, -:-1-j
•
'1
,·•
-'.~
F ,
4- _r
.-
•
V·
,
,-"
!(: ;'. , d""""oi"__w,_,W'"~'. ,,_ - ,.• _"'_'c.,.~
•
j'1,i..:i.
~.
_.#'" '.00"._"._.",._".,.,.._",-=_,__."" . ."f_i." -'if
'.-r- /' ~""~"'~_~_
,'i!(JEl}~~i
•
[c) image with Gaussian noise' (dIS x 3 median filtered.
Figure 7.21 Median filtering .
•
HtfHHHH "
Figure 7.22 Spatia! averaging versus
.;. II
median filtering, \ II b
.J,., (a) Original c d
(b) with binary noise
(c) five nearest neighbors spatia! average
(d) 3 x 3 median filtered
An alternative to median filtering for remoYil}g bin<j[Lor iso!!,tt.:d noise is to find the
spatial average according to (7.24) and replace the pixel at m;n by this average
whenever the noise is large, that is, the quantity Iv(m, n) - y(m, n)j is greater than
some prescribed threshold. !:.2r additive Gaussian noiseJ more sophisticated
. §moothing algorithms are possible. These algorithms utilize the statistical proper-
ties of the image and the noise fields, ,AQaptive al 0 . that adjust the filter
response according to local variations in the statistical properties of the data are also
possible. In many cases the noise.is multiplicative, Noise-smoothing algorithms for
such images can also be designed. These and other algorithms are considered in
•
o
Chapter 8.
The unsharp masking technique is used commonly in the printing industry for
crispening the edges. A signal proportional to the unsharp, or low-pass filtered,
version of the image issubtracted Iromthe.Image. ~'hi~valenttQa.dding the '0'
~gradient, or a high-pass signal, to the image (see Fig. 7.23). In general the unsharp
masking operation can be represented by - ..
•
" •• "F v (/;', n) = u(m, n) + Ag(hl;n)~; (7.31)
r
where A> 0 and gem, n) is a~1i:--defined gradient at (m, n), A'cornmonly used
gradient function is the discrete LaE!.acian_
Signal
------
121 low-pasS
_ _ _ .-I' i
!
. (31 (11 - (2) High-pass
•
(II fM3)
•
•
• • •
·, , ,
, . •
{."
'\ <if , ".-
•
•
·• , "
• , . •
... ,
, , "..
,
•
..
•
, •
. • '
, ,. '-"4
'
, ••
•
• , "A:""'
'.41'
•
, •
f, ' - ,
, '
0; '1'
•
,
• ,
•
•
,
, -,-
•
"
:,/ ',F?L!
I' •
\
(,
V' _~""_g _. '
4:~ " ~ ""'4i--
~,_, '.')0'( - It: ".. _
. ~ow-pass fili@fS
are useful for noise..§m.QothitI aT!<:l.jn1~Q.D. !:Qgh-pa~s.
filters are usef!!Lil1extractirtg edges and in sharpenii ~ imagc.§.Band-pass filters are
psefu! in the enhancement of edges and other high-pss image characteristics in tlje •
presence ofnoise. Figure 7.26 shows examples of higl -pass, low-pass and band-pass
filters.
,. }'
. "
i
•
y
, ;.
I
.., , . •
, v. .".
•
~-.-
,. .-
(e)
Figure 7.26 Spatial filtering examples.
Top row: original, high-pass, low-pass and band-pass filtered images.
Bottom row: original and high-pass filtered images.
where ~ is the average Il,lminanc~ of the object and Sl. is theJl!<m9arcLdeyiatio.n.of the
luminance of the objectJ2lus its surround. Now consider the inverse contrast ratio
transformation .
_ fL(m. n)
v (m, n) - (J' (m, n) (7.36)
where I.l.(m, n) and ctm, n) are theJQcw mean and standard deviation of ulm, n)
measured over a window Wand are given by .
1 .
1
<
!k(m, n) = . 2::2:
Nw(k.I)<W
u(m - k, n -I) (7. 37a)
"
., 1 L'2
lcr(m,n)= Nw(~tw[u(m-k,n-l)-I.l.(m,n)]2 . (7.37b).
This transformation generates an image where the weak (that is, .!ow-contrast)
~are enhanced. •...A sIiecial case of this is the transfor~_
.- ,
. . ' . --
_;-- .. -'-~_ '- ~.- -_.~
.
_ Jt(m, n)
( )
vm,n-( ) .(7.38)
o m, n
•
~
,"
, ,
, ,,•
,
. '
~
r ,'#(7:'84
'-~·i\i)j ",>00,_'_»""_"'"'' ,v..",,_.__ _- i ' ''';; ,olpiif~
Figure 7.27 Inverse contrast ratio mapping of images. Faint edges in the originals
have been enhanced. For example, note the bricks on the patio and suspension
cables on the bridge,
H= [i iJ (7.39)
This gives
(a)
.... ~
....
J' ....
. ,.r·t
Hi
.~
- ,
•
•
""-
I 1.-W;;
•
,••.
r ....
•
f' ;,-- .
, /' II ." 1
iI' ill iI' lli'll ~
! ,
~$ 4l'
'"
• .._, .,-
-<,,-"" ',"
t I
,
~,
- '"~ • ill
•
.,
•
I
I
,
I '
1 \
(b)
Figure 7.28 Zooming by replication from 128 x 128 to 256 x 256 and 512 x 512 images.
Linear interpolation of the preceding along columns gives the first result as
V (2m, n) = VI (m, n)
.v(2m +.1, n) =~ [VI (m, n) + VI (m + 1, n)], (7.42)
Osm sM -l,OsN S2N-1
Here it is assumed that the input image is zero outside [0, M - 1J x [0, N - 1J. The
•
above result can also be obtained by convolving the 2M x 2N zero interlaced image
with the array
I
l ,2 t-
H= i-' m 1
2 (7.43)
! !
4 !
•
o
0
0
1
0
0
0
32t0-.5
o 0 0 0
3 Z
1,5 1
1
0.5 0.25
0.5
L-
- - • ...
•
la}
,
I
•
•• c-4c.;,,~
• , .. ,co., '.y.. "
'!
, ..
•
• • ;j
~ __ .,)l ~-~ ..
•
\. '~_~',,"r}j}tc)x.x.w:D~e"J6,F
(bl
Figure 7.29 Zooming by linear interpolation from 128 x 128 to 256 x 256 and
512 x 512 images.
whose origin (m = 0, n = 0) is at the center of the array, that is, the boxed element.
Figure 7.29 contains examples of linear interpolation. Inmost of the image process-
ing applications, linear interpolation performs quite satisfactorily. Jligher-or@L .
(say, p) interpolation
.- is.PQssiblelu'.!lilddillg each.row and each column oftheJnput .
.image by prows and p columns ofzeros, respectively, and convolving it p times with
If (Fig. 1.30), For example p = 3 yields a cubic sElille inter.l221atioll,}n between the
pixels,'
,
1 2 ••• p
•
Figure 7.30 pth order interpolation.
.....:.-.-.-.~., ,. ~
•
,
7.5 TRANSFORM OPERATIONS
. .
In the transform operation enhancement techniques, ,zero:!!leI!lon::,opc:!:ations are
performed on a transformed image followed by the inverse transformation, as
shown in Fig. 7.31. We start with the transformed image V = {v (k, i)} as
V = AUAT (7.44)
where U,.. {u(m, n)} is the inp\lt iI!l~M' Then the inverse transform of
v' (k, i) = f(v(k, l» . (7.45) .
gives the enhanced image as
(7.46)
•
•
/ -... / .
o a bN-bN-aN-l
k c k
i d
'HPF
i
N-d
HPF
N-c
-,
<: , ;
Figure 7.32 Examples of zonal masks g(k, 1) for low-pass filtering (LPF), band-
pass filtering (BPF), and high-pass (HPF) filtering in (complex) DFT and (real)
orthogonal transform domains. The function g(k,/) is zero outside the region of
support shown
. ,for the particular filler. •
• •
256 -
lmaae Enhancement
'
. Chap. 7
•
,,
r"::~-"/-1
.
I
•
DCT
•
,
•
I
I ,;1
,,;*
,,,,.
,
'"
"• • "
r,l } '--,} • ,"''''''-.' ")'}"'-<' :····01
Hadamard Transform
.
!'i "'.....
c~ ,f?
_ .
;;- ''$.
"A"""'\"
II t *
.
" .....,. ,.
~ . ~, ,\'1.
", 4","0"",.. /\:
'.;,.. "';;:~;'f_ ,
"" ~; .'.", l'h;> H ' t~·, >.. f. '," _k"., , .... , .... """,. , •. -~,.
"r·-:, e
J
, I
• •
i•
','
DFT
Figure 7.33 Generalized linear filtering using ocr, and Hadamard transform and
OFT. In each case a b (a) original; (b) low-pass filtered; (c) hand-pass filtered;
•
c d
•
257
•
A filter of special interest is the ifH'l!r~fL Gm;,.££imdilter, whose zonal mask for
,V x S im3g~sis d.:rlned as --.- .
(k 2 + 12)}. .. N
g(k, I) "" exp 20.2 f { ' Os, k, 1 S,2 (7.48)
19(N .- k, N -l), otherwise
•
when A· in (7.44) is the • DFT. For other orthogonal transforms discussed in
Chapter 5,
os, k, I s, N - 1 (7.49)
Root Filtering
, • ,
•·• ,
•
• ,
,.- '.
•
" ,t t ~
J( h'7, t' ,",
,~_ 2J ~~- "r:'~.~
Originals Enhanced l.owpass filtering for noise smoothing
Inwtell Gaussian filtering ru'"'T';b' (a) Original
I-'-H (b] noisy
c d (c) peT filtar
• {dl Hadamard transform filtar .
If the magnituce term in (7.51) is replaced by the logarithm of Iv(k, 1)1 and we
r
denne
s(k, I) A [loglv(k, 1)lJe j9(k, l) , Iv(k, 1)1 >0 (7.52)
then the inverse transform of stk, I), denoted by c(rn, n), is called the generalized
cepstrum of the image (Fig. 7.36). In prac~ice a gosi.1ive constan1.ls added to lv (k, 1)1
"
to prevent the logarithm from going to negative infinity. The image c(rn, n) is also
- _'d .-"~
f--------~------------------------, ,
, I
Image I Oepstrum
u{m, nl I vlk, I) $( k,l) ! elm, n)
I
AUA T /log I v( k, III ] "i.I', n A'1S(A TI"
I
I I
I
L . ._~
I
,
(b) InverseJ:lomomorphic transform. 3C.
Zonal 3C. ,
3C mask
,
•
,
._ .- • , _j ,0,.,,'
r' -.- ,
. ,
J",,<,_ ~
•
,, '\
..) -:
' -" '
.
f'
i• r
,
I, ij
'J•
5,-'
J,
,
,'
,
..• ,
I . I ..
~
•
cepstrurn of the building image generalized cepsrra
(a) original 1a b
(bl DFT c d
Ic) DCT
(dl Hadamard transform
Figure 7.37 Generalized cepstra:
called the ,generalized homomorphic transform, !;l', of the image uim, n). The
, generalized homomorphic linear filter performs zero-memory operations on the
..... _. __' _ ' _ S " _.. ,"" ~,
In multispectral imaging there is a sequence of I images U, (m, n), i = 1,2, ... ,J,
where the number I is typically between 2 and 12. Itis desired to combine these
images to generate a single or a few display images that are representative of their
leatures. There are three common methods of enhancing such Images.- .... ~-'
Intensity Ratios
The log-ratio L;,; gives a better display when the dynamic range of R i,; is very large,
which can occur if the spectral features at a spatial location are quite different.
Principal Components
The 1 x I kr:rtransform of u(m, n), denoted by <P, is d~termined from the auto-
correlation matrix of the ensemble of vectors {U; (m, n), i = 1, ... , 1}. The rows of
$, which are eigenvectors of the autocorrelation matrix, are arranged in decreasing
order of their associated eigenvalues. Then for any 10sl, the images v;{m, n),
i = 1, ... ,10 obtained from the KL transformed vector .
vim, n) = $u(m, n) (7.56)
are the first 10 principal components of the multispectral images.
Figure 7.38 contains examples of multispectral image enhancement.
i.:';-
>'
, ,
•
•
•
., :\
• •
Since we can distinguish many more CC!0rs than gray levels, the perceptual dynamic
range of a display can be effectively increased by coding complex information in .
. color. False color implies mapping a color image into another color image to
provide a more striking color contrast (which may not be natural) ,to attract the
attention of the viewer,
. Pseudocolor refers to mapping a set of images u, (m, n), i = 1, ... ,1 into a
color image. Usually the mapping is determined such that different features of the
data set can be distinguished by different colors. Thus. a large data set can be
presented comprehensively to the viewer.
v1 (m • nj R
Input • 2 (rn, nl Color . I
Feature G
•
coordinate Display
extraction v3(m.nj transformation
Images B
u;(m, nj
To T'3 B'
Monochrome
image enhancement
•
•
7.9 SUMMARY
,•
In this chapter we have presented several image enhancement techniques sccom-
. panied by examples. Image enhancement techniques can be improved if the en-
hancement criterion can be stated precisely. Often such criteria are. application-
dependent, and the final enhancement algorithm can only be obtained by trial and
error. Modem digital image display systems offer a variety of control function
•
switches, which allow the user to enhance interactively an image for display.
PROBLEMS
•
7.1 (Enhancement of a low-contrast image) Take a 25¢ coin; scan and digitize it to obtain
. a 512 ;( 512 .mage. '
a. Enhance it by a suitable contrast stretching transformation and compare it with
histogram equalization. , . .
,
b. Perform 'uusbarp masking and spatial high-pass operations and contrast stretch the
. results, Compare their performance as edge enhancement operators.
7.2 pSing(7,6) and (7.9), prove the formula (7,B) for extracting the nth bit of a pixel.
7.3 a. Show that the random variable v defined via (7.11) satisfies the condition
Prob[v s u J== Prob[u Sp-l (tr)) == F(P-l (e-) == 0', where 0< 0-< 1. This means v .
is uniformly distributed over (0, 1).
b. On the other hand, show that the digital transformation of (7.12) gives p, (vd ='
1'''(.'', l. where 1'. is given by (7.14).
7.4. :rk,,·"'l' :Ill .ll,:,'f'ithm for (a) an M x ,If median filter. and (bl an}4 x 1 separable
mcclr.ru tittet; ~h.-lt minuuizcs the number of operations required for filtering 1'1 X N
images, where N 3> M. Compare the operation counts for M =' 3,5. 7.
7.5* (Adaptive unsharp masking) A powerful method of sharpening images in the pres-
ence of low levels of noise (such as film grain noise) is via the following algorithm [15].
The high-pass filter
-1 -2 -11
r.1 -2'
-1
12
-2
-2J
-1
which can be used for unsharp masking, can be written as 1~ times the sum: 'of the
following eight directional masks (H. ): . .
000 -100 0·2 a 00-1 000 000 000 000
-220 010 020 010 o2·2 o l. o 020 010
000 000 000 {) 0.0 . {) 0 a 00-1 0-2 a -100
The input image is filtered by each of these masks and the outputs that exceed a
threshold are summed and mixed with the input image (see Fig. n.S). Perform this
algorithm on a scanned photograph and compare with nonadaptive unsharp masking.
, ,
Ima9" Directional v. A 1:: vfJ T
Enhanced
Threshold" 1:
mask H. 6: I va 1>'1 image
I I I I
.. ~ : ,
• ,
, . , ..
r
Figure P7,S Adaptive unsharp masking.
7.6* {Filteringusing phase) One of the limitations of the noise-smoothing linear filters is
that their frequency response has zero phase. This means the phase distortions due to
noise remain unaffected by-these algorithms. To see the effect of phase, enhance a noisy
image (with, for instance, SNR =.3 dB) by spatial averaging and transform processing
such that the phase of the enhanced image is the same as that of the .original image (see
Fig. P7.6), Display it (m, n), ii (m, n) and u(m, n) and compare results at different noise
levels, If, instead of preserving e(k, I), suppose we preserve 10% of the samples
1\i(m, n) ~ [IDFT{exp jaCk, I)B that have the largest magnitudes. Develop an algorithm
for enhancing the noisy image now. , ,
•
OFT Measure phase
I, ej 9tk,l l •
I
i
u(m, nJ +
Noise smoothing
A
u{m, n)
OFT
A
lv(k,l}i
x IOFT
-
u(m. nl
1: . .
+ J
•
•
Noise f/{m, n)
•
BIBLIOGRAPHY
Section 1.1
7. R. Nathan. "Picture Enhancement for the Moon, Mars, and Man" in Pictorial Pattern
Recognition (G. C. Cheng, ed.). Washington, D.C.: Thompson, pp. 239-266, 1968.
8. F. Billingsley. "Applications of Digital Image Processing." Appl. Opt. 9 (February
1970): 289-299.
9. D. A. O'Handley and W. B. Green. "Recent Developments in Digital Image Processing
at the Image Processing Laboratory at the' Jet Propulsion Laboratory." Proc. IEEE
60 (1972): 821-828. '
•
10. R. E. Woods and R. C. Gonzalez. "Real Time. Digital Image Enhancement," Proc.
IEEE 69 (1981): 643-654.
•
Section 7.4
Further resultson median' filtering and other spatial neighborhood processing tech-
niques can be found in: .
.I
11. J. W. Tukey. Exploratory Data Analysis. Reading.-Mass.: Addison Wesley, 1971.
Section 7.5
For cepstral and homomorphic filtering based approaches see [4, 5] and:
•
16. T. G. Stockham, Jr. "Image Processing in the Context of a Visual Model." Proc. IEEE
60 (1972): 828-842.
Sections 1.6-7.8
•
•
.
,
•
• •
Image Filtering
and
Restoration
8.1 INTRODUCTION
, •
O'sp ,":yor
reco n"
,- •
F~ '8.1 Digital image restoration system.
267
•
Imagi] r8-;tor3rion
,,
,
,• , .
•
,
properties of a data set, whereas image enhancement techniques are much more
image dependent. Figure 8.2 summarizes several restoration techniques that are
discussed in this chapter.
where u(x;y) represents the object (also called the' original image), and v(x,y) is
the observed' image. The image formation process can often be modeled by the
linear system of (8.2), where h (x, y;x', y ') is its impulse response. For space invari-
ant systems, we can write
h(x,y;x',y'),;h(x -x',Y -Y';O,O)~h(x ':"x',y -y') (8.4)
The functions f(·) and g (.) are generally nonlinear and represent the characteristics
of the detector/recording mechanisms. The term 11(X, y) represents the additive
noise, which has an image-dependent random component f[ g (W)]111 and an image-.
independent random component 112'
Table 8.1 lists impulse response models for several spatially invariant systems.
Diffraction-limited coherent systems have the effect of being ideal low-pass filters ..
For an incoherent system, this means band-limitedness and a frequency response
obtained by convolving the coherent transfer function (CTF) with itself (Fig. 8.4).
Degradations due to phase distortion in the CTF are called aberrations and manifest
themselves as distortions in the pass-band of the incoherent optical transfer function
(OTF). For example, a severely out-of-focus lens with rectangular aperture causes
an aberration in the OTF,as shownin Fig. 8.4.
Motion blur occurs when there is relative motion between the object and the
camera during exposure. Atmospheric turbulence is due to random variations in the
refractive index of the medium between the object and the imaging system. Such
degradations occur in the imaging of astronomical "Objects; Image blurring also
occurs in image acquisition by scanners in which the image pixels are integrated
over the scanning aperture. Examples of this can be found in image acquisition by
radar, beam-forming arrays, and conventional image display systems using rele-
Atmospheric turbulence 2
exp{ -7I'0/.2(x + y2)}
H{~,.OJ
•
t cTF 01 a coherent diffraction limited svstern
• OTF of an incoherent diffraction limited system
-~~'
•
,,
,." ....
'" " ",,-
-1.0 -0.5 0.5 . 1.0 ~,
•
Figure 8.4 Degradations due to diffraction limitedness and lens ~l.>erration.
vision rasters. In the case of CCD arrays used forimage acquisition, local inter-
• actions between adjacent array elements blur the image.
Figure 8.5 shows the PSFs of some of these degradation phenomena. For an
ideal imaging system, the impulse response is the infinitesimally thin Dirac delta
function having an infinite passband. Hence, the extent of blur introduced by a
system can be judged by the shape and width of the PSF. Alternatively, the pass-
band of the frequency response can be used to judge the extent of blur or the
resolution of the system. Figures 2.2 and 2.5 show the PSFs and MTFs due .to
atmospheric turbulence and diffractiqn·limited systems. Figure 8.6 shows examples
of blurred images.
hlx, Y)
•
..'...1---...,
"0
---------:I---L .... x
r
/
(a) One dimensional motion blur
h{x, 0) h(x.O\
•
'--__ x '"-----x
o
y ,
Y
fbI Incoherllfll diffrllClion limlled ~.tem (Ie';. cutoff) (c) Average atmospheric turtlulei1ce
,
FIgure 8.5 Examples of spatially invariant PSFs.
•
••
.
•
,•• -
•••
, {
'-<1
•
.'
•
!I . •
".
~._-~._
~-' ; ....'.. _',-,= •..
'. .~
a b
Figure 8.6 Examples of blurred images . (a) Small exponential PSF blur;
c d
vp(r, 4»= j
. . <r) 11'<')
up(r+s,4>+6')sdsd9' (8.5)
-<1>0(') I,(Yi
•
where 4>0 (r) is the angular width of the elliptical contour from its major axis, I, (r) and
/2 (r) correspond to the inner and outer ground intercept; of radar pulse width around
• the point at radial distance r, and up(r, 4» and vp(r, 4» are the functions u(x,y) and
. v(x,y), respectively, expressed in polar coordinates. Figure 8.7c shows the effect of
scanning by a forward-looking radar. Note that the PSF associated wil:h (8.5) is not
shift invariant. (show!) ,
••
,
I
I
r Radar beam axis
h
I, -</>
I• ,
I
I I,
I
I
I Antenna half-
power ellipse
,U)¥
I
•
•
• I
••
, J.' ,,,
, ... ~ ," s
,
8; e.
•
•
"'/
• • _0,%
•
f
0
>
"0'
.. , t. •
.(
, •
-':'~ ~-"'t!'<
, ",,--;' 1
• ~>.> ..,
"
".
•
$»' ,
,
, , • •
lrh
to·
> " .
i'i
$-
~""~_"''' M_'~_'_ - .."..... ,._gf~
Ib) object; (el FLR image (simulated).
•
,
Figure 8.7 FLR imaging.
•
Detector and Recorder Models
The response of image detectors and recorders is generally nonlinear. For example,
the response of photographic films, image scanners, and display devices can be
written as
g '" a.w~ (8.6)
'where ex and fl are device-dependent constants and Ii' is the input variable. For
photographic films, however, a more useful form of the model is (Fig. 8.8)
d = 'Y 10glO Ii' - do (8.7)
where 'Y is called the gamma of the film. Here w.represents the incident light
intensity and d is called the optical density. A film is called positive if it has negative
'Y. For 'Y = -I, one obtains a linear model between wand the reflected or trans-
mitted light intensity, which is proportional to g ~ lO-d. For photoelectronic de-
vices, w'represents the incident light intensity, and the output g is the scanning
beam current. The quantityB is generally positive and around 0.75.
Noise Models
The general noise model of (8.3) is applicable in many situations. For example, in
photoelectronic systems the noise in the electron beam current is often modeled as
Vg(x, y) 'Ill (x, y) + '112 (x, y)
'I](x, y) = (8.8)
.
where g is obtained from (8.6) and 'rll and ""2 are zero-mean, mutually independent,
Gaussian white noise fields. The signal-dependent term arises because the detection
•
and recording processes involve random electron emission (or silvergrain deposi-
tion in the case of films) having a Poisson distribution with a mean value of g. This
distribution is approximated by the Gaussian distribution as a limiting case. Since
the mean and variance of a Poisson distribution are equal, the signal-dependent
term has a standard deviation vg if it is assumed that 'TJ1 has unity variance. The
other term, 'TJ., represents wideband thermal noise, which can be modeled as Gaus-
sian white noise.
..
Sec~ 8.2 Image Observation Models 213
•
In the case of films, there is no thermal noise and the noise model is
TJ(x, y) = Vg(x, y) '1'11 (x, y) , (8.9)
where g now equals d, the optical density given by (8:7). A more-accurate model for
film grain noise takes the form.
TJ(X, y) = e( g (x, y»)" TJI (x, y) (8.10)
where E is a normalization constant depending on the average film grain area and 1'1'
lies in the interval j to t. )
The presence of the signal-dependent term in the noise model makes restora-
tion algorithms particularly difficult. Often, in the functionf[g(w)), w is replaced
by its spatial average ....", giving
'I'l(x, y) = /[g(fL.,)]111 (x, y) + 'Il2 (x, y) (8.11)
which makes T1(X, y) a Gaussian white noise random field. If the detector is oper-
ating in its linear region, we obtain, forphotoelectronic ,devices, a linear observa-
tion model ot.the form
v (x, y) = w(x, y) + vf;::'I'l1 (x, y) + 'll2 (x, y) (8.12)
where we have set a = I in (8.6) without loss of generality. For photographic films
. (with 'Y = -1), we obtain .
v (r, y) = -log lO W + a711 (x, y) - do (8.13)
•
where a is obtained by absorbing the various quantities in the noise model of (8.10).
The constant do bas the effect of scaling w by a constant and can be ignored, giving
(8.14)
•
where v (x, y) represents the observed optical density. The light intensity associated
with v is given by
i (r, y) = lO- v(>. y)
'" w(x, y)10-""'(<'>~
= w(x,y)n(x,y) (8.15)
where n ~ 10-a, " now appears as multiplicative noise having a log-normal distribu-
•
non. •
•
•
--./
Inverse Filter
Inverse filtering is the process of recovering the input of a system' from its output.
For example, in the absence. of noise the inverse .filter would be a system that
recovers u(m, n) from the observations vern, n) (Fig. 8.9). This means we must have.
s' (x) = s:' (x), or gf[g(X)) = x (8.21)
hi [m, n; k, I) = h -1 (m, n; k, I) (8.22)
that is,
2:2:
k',l'= _eo
hl(rn, n i k:', l')h(k', 1';k,/)=6(m -k,n -I) (8.23)
Inverse filters are useful for precorrecting an input signal in anticipation of the
•
degradations caused by the system, such as correcting the nonlinearity of a display.
Design of physically realizable inverse filters is diffkult because theX are often
• . , .
I •
I
u(m. n) wlm.n} vlm,n) I ulm,n)
him, n; k, I} gl • I gf( • I h1lm, n; k.il
; "
I
System lnv""" tilter
2:2:
k', I' "" __ee
h' (m - k', n '-1')Jz(k', I') = SCm, n), V(m, n) (8.24)
Fourier transforming both sides, we obtain HI «(,)" w2)H (WI> U>2) = 1, which gives
W(wl>U>2)=H(l ) (8.25)
Ulb W2
. ' " ,
that is, the inverse filter frequency response is the reciprocal of the frequency
response of the given system. However, HI (U>I> U>2) will not exist if H ('wI> U>2) has
any zeros.
Pseudoinverse Filter •
The pseudoinverse filter is a stabilized version of the inverse filter. For a linear shift
invariant system with frequency response H(u>!> wz), the pseudoinverse filter is
defined as .
1
Hi=O •
• •
The Wiener Filter
•
The main limitation of inverse and pseudoinverse filtering is that these filters
remain very sensitive to noise. Wiener filtering is a method of restoring images in
,
the presence of blur as VleU as noise. .
Let u(m; n) and v(m, n) be arbitrary, zero mean, random seq!1ences. It is
desired to obtain an estimate, It (m, n), of u(m,n) from v(m, n) such that the mean
square error
O'~ = E{[u(m, n) - It (m, n)j2} (8.28)
•
_~~~'Y!-:\?!"?/ -"",.,,<! H¥ "F
t_J1!,r)'yF
~- .<
. •
_.'
!'I""--""","" ,
t
t.
f!
t ,.',.
~
o
\
¢'"
·o
• , •
o
\,>"
.~?
'L -~
• - u
,
o •
'.,,: "
•
~. _l~L ~
, .. ' - -""',
(al Original image Ib}8lurred image
-q,1,-~:l«y~y~ ~- ~
r-
"'"' '" ,..., . ../1( "" "' •• q
,,':'
-:~~j
~
,
..... "'~t:--t;-
--
-
'-~\
.-'"
f ,.-' ""/
'.•
•
- ,-.. ~
, '
o
•
,
•
~
• ," 'F";
•
• ""- ""<+",
•
•
•
,i
- "c"
-'.
· .
• ,
- - . '. '" /,_4
-.' - '
l
'j
., -
" ,•
o
---~
"':.-,.~-...- ",",(","'0,,,.,,. ",,",
~
',/ ;, - '
.",
"'0 ,
0" ,,'_
, ,,", <'"
, -
' ,_, ""'" ,\.:r - • ,_ ,_ "0" c,,' ..'.,., ",",' ,he)
-~ ,.....
.
,_." u(min) = LL gim, n i k, l)v(k, I)
- - 'k.,J;:;:;-~ --
-
(8.30)
where the filter impulse response g(m, n; k, l) is determined such that the mean
square error of (8.28) is minimized. It is well known that if u(m, n) and vtm, n) are
•
2:2: gem, n; k, l)r" ik, I; In', n') = r•• (m, n; m " n') (8.33)
k./ =:\ -:;;
Equations (8.30) and (8.33) are called the Wiener filter equations. If the auto-
correlation function of the observed image v im, n) and its cross-correlation with the
object ulm, n) are known, then (8.33) can be solved in principle. Often, u(m, n)
and v (m, n) can be assumed to be jointly stationary so that
(8.34)
for (a, b) = (u, u), (u, v), (v, v), and so on. This simplifies g (m, n; k, 1) to a spatially
invariant filter, denoted by g (m - k, n -'-I), and (3.33) reduces to
where l)(m, n) is a stationary noise sequence uncorrelated with uim, n) and which
has the power spectral density S"w Then
S,y (WI, (2) := IH (w" (2)1 2 Suu (WI, 1I.l2) + S"" (Wj, (2)
. . . (8.40)
SUY (WI, 00:2) := H* (WI' ooz)5•• (WI' 00:2)
This gives
• (8.41l
which is also called the Fourier-Wiener filter. This filter is completely determined
by the power spectra of the object and the noise and the frequency response of the
imaging system. The mean square error can also be written as
Remarks
In general, the Wiener filter is not separable even if the PSF and the various
covariance functions are. This means that two-dimensional Wiener filtering is not
equivalent to row-by-row followed bycolumn-by-column one-dimensional Wiener
filtering. . ' . .
. Neither of the error fields, e(m, n) = u(m, n) - ii(m. n), and e(m, n) ~
v(m, n) - hem, n) ® u(m, n), is white even if the observation noise 7J(m, n) is white.
U = G(V - M.) + M; ,,
S (8.45)
= GV + IH 21S 'll'll slit. - GM~ •
, uu + 1l'l
,
where M(wb W2) f) .9"{f.L(m, n)} is the Fourier translorm of the mean. Note that the
above filter allows spatially varying means for u(m n) and 7](m, n). Only the covar-
iance functions are required to be spatially invariant. The spatially varying mean
may be estimated by local averaging of the image. 1n practice, this spatially varying
filter can be quite effective. If f.Lu and ~ are constants, then M; and M.. are Dirac
delta functions at ~l =S2 = O. This means that a constant is added to the processed
image which does not affect its dynamic range and hence its display.
that is, the phase of the Wiener filter is equal to the phase of the inverse filter (in the
frequency domain). Therefore, the Wiener filter or, equivalently, the mean square
criterion of (8.28), does not compensate for phase distortions due to noisein the
observations.
•
Wiener Smoothing Filter. In the absence of any blur, H = 1 and, the
Wiener filter becomes '
,
Suulw,.O)
Glw,.O)
Inverse
filter
,
Wiene'r
--
• ........
filter
_~ Smoothing
.... ...,...- filter
Relation with Inverse Filtering_ In the absence of noise, we set S"" = 0 and
the Wiener filter reduces to
H*Suu " 1 )
G I5",,_0= IHfSuu =Ei (8.48
which is the inversefilter, On the other hand, taking the limit 51)1)-> 0, we obtain
f.l
lim G =
5",,-0
jH'
0
H=i=O=lr (8.49)
, , H=O
which is the pseudoinverse filter. Since the blurring process is usually a low-pass
filter, the Wiener filter acts as a high-pass filter at low levels of noise.
"
Interpretation of Wiener Filter Frequency Response. When both noise
and blur are present, the Wiener filter achieves a compromise between the low-pass
noise smoothing filter and the high-pass inverse filter resulting in a band-pass filter
(see Fig. 8. lIb). Figure 8.12 shows Wiener filtering results for noisy blurred images.
Observe that the deblurring effect of the Wiener filter diminishes rapidly as the
noise level increases. "
Wiener Filter for Diffraction Limited Systems. The Wiener filter for the
•
continuous observation model, analogous to (8.39)
is given by.
(8.51)
For a diffraction limited system, H(!";" 1";2) will be zero outside a region, say 9l, in the
frequency plane. From (8.51), G will also be zero outside a. Thus, the Wiener filter
cannot resolve beyond the diffraction limit. ,
. ,
~, ,
,
J
,,
';.,. -
~ ,"'~/"'i~j'_.. ~ ...i""J'rr.?"JI
i
____
, , _ __ " ;
.J
,
)
"I <t;#
-
,- '~.
,•
J'
.. ...; cJ
b
..
.
....-....
.\ , ' •..... i"at,v , _ ,t'
."
'-J ""l
g(m,n)=-\ ~~ -!:!.sm,fl<!:!.-l
G(k,l)w-(mHnJ), (8.53)
N (k.J)~-Nr2 . 2 2
where G (k,l) ~ G (21J'kIN, 21J'lIN), W ~ exp(- j21J'IN). The preceding series can
be calculated via the two-dimensional FF]" from which we can approximate gem, n)
by If (m, n) over the N x N grid defined above. Outside this grid g(m, n) is assumed
to be zero. Sometimes the region of support of gem, n) is much smaller than the
N x N grid. Then the convolution of gem, n) with v(m, n) could be implemented .
directly in the spatial domain. Modern image processing systems provide such a
facility in hardwareaUowing high-speed implementation. Figure 8.13a shows this
algorithm.
,
Input image Restored image
A
Vim. nl Convolution ulm, n} •
. glm, nl
•
N N N
M vim, n} 0
N N -Vlk, II N Ulk,
A
n
M VIm, nl 0 0
•
iIrulge N ;
Wlm,n}
windowing , N -
Glk,1I
Glw,. w2}
.
Sample over
!
. NX Ngrid
•
• M
•
M D(m, n)
•
NX N 10FT
and select
•
Theoretically, the Wiener filter has an infinite impulse' response which requires
working with large size DFTs_ However, the effective response of these filters is
often much smaller, than the object size. Therefore, optimum FIR filters could
achieve the performance of IIR filters but with lower computational complexity.
Filter Design
where o? is the variance of the image, and using the factthat z, (k, I) = 'o( -k, -1),
(8.58) reduces to
Cf<
~&(k,I:)+rG(k,I)@a(k,l) eg(k,l)
o (8.63)
= h (k, I) @'o(k, l), (k, l) E W
• ,
The number of unknowns in (8.58) can be reduced if the PSF hem, n) and the
image covariance ru" (m, n) have some symmetry. For example, in the often-
encountered case where hem, n) = h(lmi, In!) and '"u(m, n) "" '"u(lml, In i), we will
have gem, n) = g(jmi, in i) and (8.63) can be reduced to (M + 1)2 simultaneous
equations.
The quantities aik, I) and 'U!' (k, I) can also be determined as the inverse
Fourier transforms of \8 (<OJ, W2W and Suu (<OJ, (02)H' (Wj, (02)' respectively.
When there is no blur (that is, H "" 1), h(k, I) = o(~, l), 'u,' ik, I) = r"u (k, I) and
the resulting filter is the optimum FIR smoothing filter.
The size of the FIR Wiener filter grows with the amount of blur and the
additive noise. Experimental results have shown that FIR Wiener filters of sizes up
to 15 x 15 can be quite satisfactory in a variety of cases with different levels of blur
and noise. In the special case of no blur, (8.63) becomes .
,
Cf2 ,
-;;¥o(k, I) + To (k, I) @g(k, I) = To (k, I), (k, l) E W' (8.65)
. From this, it is seen that as the, SNR -:- o 2/a ; -.. co, the filter response g(k, l)-i>'
o(k, 1), that is; the FIR filter support goes down to one pixel. On the other hand, if
SNR "" 0,' g (k, I) "'" (o- 2/Cf~)ro (k, I). This means, the filter support would effectively
be the same as the region outside of which the image correlations are negligible. For
images with roCk, I) = Q.9S + , this region happens to be of size 32 x 32.
V kZ P
"[;J;{#i>!l
~
\"1!-,
,
,f :
't.
'. '"
i,
''C
~
.;>-
c '. '
J '
s.J4
,/" Cc
,;;"r & f
,iili;.", , _.$
,I"
.A... _ .-<~fr"
~
la) Blurred Image with no noise lbl Restored by,optimaI9'" 9 FIR filter
1 ~:~~--
,1;. "j
.-
,~liltl"
...;iF ,;
,,' >,,;,
<;,:,.;
':~
, ,"
'j"
~'-
})1 ,•
<~-
,/~ty·
'f" l1i- r
_ #' ",5
J;1:. .
.11:".. . '...,
; /.... 'I,'
• ,,~j#'
.
Ie) Blurred Image with small additive noise ldl Restored by 9 x 9 optimal FIR filter
Example 8.4
Figure 8.14 shows examples of FIR Wiener filtering of images blurred by the
•
Gaussian PSF
Ot>O (8.66)
V 2
and modeled by the covariance function r(m, n) "" 0"20.95 ",z + n • If Ot "'" 0, and h(m, n)
is restricted to a finite region, say -ps m, n s p, then the h(m,n) defined above can
be used to model aberrations in a lens with finite aperture. Figs. 8.15 and 8.16 show
examples of FIR Wiener filtering of blurred images tbat were intentionally misfocused
by the digitizing camera.
"
'i'
,," " '\ ~ c",
~
,(
,
/
\",
" <>••
'"
•
j
,
-, ...:l. _'" ' ", '
- ',." -
(al Digitizad image (bl FIR filtered
if
' ~~
-;,- 'f'''-
s- '.; ',*'
,~"
•
.'-
v'
'1
\1:\~
'j:'- , (-'.,
J;'- •
"'-...;~-
1,..;.- ;f
fc!
\
•
',I)
"i
"*,
(c) Digitized image Cdl FIR filtered
The local structure and the ease of implementation of the FIR filters makes them •
attractive candidates for the restoration of nonstationary images blurred byspatially
varying PSFs. A simple but effective nonstationary model for images is one that has
spatially
,
varying mean and variance functions and sbift-invariant correlations, that
IS,
•
E[u(m; n)]:;;; /k(m, n)
E[{u(m, n) - Il-(m, nn {u(m - k, n -l).,. l1-(rn - k, n - 1m (8.67)
•
,,
•
, . ,
, I
'
I
.j
•
•
.~
£,~
T~
,,, .
;di
'If, .\
• !\
,-!
'1%
,..
,.
~4
.~
~; ',- .,"
.'
£
'".]j
!
1
v·
I•
<al Digitilad im~ge
, lb) FIR filtered
-
r#''''''('''-~'':---'''''''''f ~_ "'<:'1 ":?i ~ .-~ '"<} !'/Y"'t ~ ":,' _ "4"'._ : ' / <1ft" I" " >,",.,.1. r, 'if
C4 :-,-!eF'·' '"'>d r ~ r- ,:".-; 'Y'".~ '1"". L·'!'" (' !".;j' ;, ( , ?.' ..... ,-- ~4 '" "".- -K" _ 11 '~~ \1' (;
-~,-- - ~~_ C:1':::'\""" "c~ ~,. ..., n ""","n '""K' ',<. '~T-' ", ,., ," '. ""It'" 'J • ...,.., .... '" / '. "\:: S,~: ~
~" ..",.<,_H
" " >,I'
__, .,...., _'_ __ ',' ~• ""....,."...,..,., ~"'., .~,~. _,.~ ,"" '." " ~
"~'" ~
,'~'" -n --'\ n".Q
n. -\.v··~"-.'_' •.• __
-'-':~- ~<_'~'_I'?')
~f.( '; f."l.<"",r;,'·' .r-rr-.>: ~ -r--
..
(;;./_~. ~~H" ~;_."'''':/ ~!'. __ ~,,'
•
• .. .. • •
- -.-
• ..
,......--
',_'d'_ -"
,..,-..•"_ ,_
',.
-,' -~
.1,
, • • .."- ..
• "
~
-- :-· ..·Ct'
,' .. ' "'~
- -~
• •
----------~._-----
•
":- '"
, . . -- -
where ro (0, 0) = 1. If the effective width of the PSF is less than or equal to W, and h,
IL, and 0- 2 are slowly varying, then the spatially "'arying estimate can be shown to be
given by (see Problem 8.10) •
..
'.
U(m,n)=
. (i,ll
2:2:€ wgm,ll .(i, j)v(m,"': i.n - j) (8.68)
-
gm." (i, j) ~ gm."
.
(i, j) + (2/1/ I)Zll - 2:2: s-: (k, I ) '
't" (k,f) E W
(8.69)
2 2 2
where gm," (i,}') is the solution of (8.64) with 0- = ir (m, n). The quantity 0- tm, n)
can be estimated from the observation model. The second term in (8.69) adds a
.'"',"' ..~
,••
o. .~
•
•,oJ;, ,j
',.'
f> '- •
•
••
,
,,- ,
••
•
, ~-
it
-,
"
,•
(e) Cesar •
(d) Optrnask
- " -",~
r
l1
,J
•
'c"'
,- -'r
,
" .-,""-
, is
_ .. 0fAl'* ~ ~_ '" - lii.:'_"f,~,
I
lai Noisy ISNR = 'OdB) (b) Wiener
••
_ ',-;;
the Casar filter are based on a spatially varying AR model, which is discussed in
•
Section 8.12. Comparison of the spatially adaptive FIR filter output with the con-
ventional Wiener filter shows a significant improvement. The comparison appears
much more dramatic on a aood quality CRT display unit compared to the pictures
printed in this text. .
•
The Wiener filter of (8.41) motivates the use of other Fourier domain filters as
discussed below.
This filter is the geometric mean of the pseudoinverse and the Wiener filters, that is,
" ( S H* los
G u2 F
s: 1I2
IHH-II12 exp{-j6n} (8.71)
z
IHl Suu + S'I'I
where 6n(WhWZ) is the phase of H(WhWZ)' Unlike the Wiener filter, the power
spectral density of the output of this filter is identical to that of the object when
H '1= O. As S~~ --l> 0, this filter also becomes the pseudoinverse filter (see Problem
8.11 ).
Nonlinear Filters
, '" ,,,.",g~l~j
(8) a ~ 0.8 lb) a =,0.6
.' i •
,; ; ,, '
Figure 8.19 Root filtering of FLR image of Fig. 8.7c.
•
,
Sec. 8.5 .; ether Fourier Domain Filters ' ' , 291
(having small energy at, high spatial frequencies) because the samples of low
amplitudes of V (WI> w,) are enhanced relative to the high-amplitude samples. for
a s- 1, the large-magnitude samples .are amplified relative to the' low-magnitude
ones, giving a low-pass-filter effect for typical images. Another useful nonlinear
filter is the complex logarithm of the observations
. (;(w uh) = {log VI}exp{j6 v}, [V!;z e. (8.74)
lr. 0, otherwise
where E is some preselected small positive number. The sequence u(m, n) is also
called the cepstrum of v(m, n). (Also see Section 7.5.)
Remarks.
If the noise power goes to zero, then the Wiener filter becomes the pseudoinverse
of $(see Section 8.9), that is, ..
'*
The Wiener filter is not separable (that is, /7 G 1 ® G 2) even if the PSF and
the object and noise covariance functions are separable. However, the pseudo-
inverse filter /7- is separable if!7l and $ are.
The Wiener filter for nonzero mean random variables u and R- becomes
,
u;: p,,, + /7 (0- - p,~) = /70- + [~!7l~I$ + !7l-lr l m;-Ip... -/7p,n (8.87)
This form of the filter is useful only when the mean of the object and/or the noise
are spatially varying.
= [Diag{v.t.9l"T<9l"~*I) + <T~(Diag{v.tGfvt*I)-lrl
l.
where we have approximated Diag{~Gf-I~.(*1J by [Diag{v.tGf.-ut*1Jr This gives
A ,
vim, nl ...[) :m~ge
. w(k,n w(~. (l 2-D inverse . ,,(m. nl
hl-m, -n) X
. tr~n.form transform
. . .
•
•
where n (x) represents the errors (or noise) in the observation process. Smoothing
splines fit a smooth function g (x) through the available set of observations such that
its "roughness," measured by the energy in the second derivatives (that is,
2 2
f[d g(x)/dx j2 dx), over [XO,XN] is minimized. Simultaneously, the least squares
error at, the observation points is restricted, that is, iotg, ~ g (Xi)' . '
N 2
F~ 2: gi - Yi $ S (8.97)
•
.-
. 0 17..
•
•
• •
For S = 0, this means an absolute fit through the observation points' is required.
•
Typically, o : is the mean square value of the noise. and Sis chosen to lie in the range
(N + 1) +- V2(N + 1), Whilll is also called the confidence jnterval of S. The minimi-
zation problem has two solutions: .
•
1. When S is sufficiently large, the constraint of (8.97) is satisfied by a straight-
line [it .
g(x) = a + bx, Xo <x <x",
(8.98)
' = (/Lx; - /Lx ....Y)
IJ 2) , . a = l.\.y - b /L.J'
(/Lxx - ....x '
ill
where /L denotes sample aver~ge, for instance, /Lx A (:to X;)/(N + 1), and so on .
•
2. The constraint in (8.97) is tight, so only the equality constraint can be satis-
fied. The solution becomes a set of piecewise continuous third-order
polynomials called cubic splines, .given by
g(x) = a, + bj(x - x;) + c;(x - X;)2 + dj(x - xY, Xj:$X <Xi .. ! (8.99)
OSi:$N-1
where a and yare (N + 1) x 1 vectors of the elements {ai, 0:$ i :$ N}, {Yj, 0 0$ is N},
respectively, and c is the (N -1) x 1 vector of elements [c., 1 Si -s N -1]. The
matrices Q and L are, respectively, (N - 1) x (N - 1) tridiagonal Toeplitz and
(N + 1) x (N - 1) lower triangular Toeplitz, namely,
1
-2
QAt! , L~.! 1 (8.101)
-3 -h
•
0
,
~ , •
and P tl (1~ LTL. The parameter i\ is such that the equality constraint
,
• , ' 2
F(A) ~ Iia -/11 = v T APAv = S (8.102)'
sr;
•
Remarl:s
The solution of (8.99) gives gj 4 g(x;) = a., that is, a is the best least squares estimate
of y. It can be shown that a can also be considered as the Wiener filter estimate of
y based on appropriate autocorrelation models (see Problem 8,15). .. .
'. The special case S = 0 gives what are called the interpolating splines, where the
splines must pass through the given data points ..
.'
Example 8.6
Suppose the given data is 1, 3, 4, 2, 1 and let h = 1 and σ_n = 1. Then N = 4, x0 = 0,
x4 = 4. For the case of straight-line fit, we get μ_x = 2, μ_y = 2.2, μ_xy = 4.2, and μ_xx = 6,
which gives b = −0.1, a = 2.4, and g(x) = 2.4 − 0.1x. The least squares error
Σ_i (y_i − g_i)² = 6.7. The confidence interval for S is [1.84, 8.16]. However, if S is chosen
to be less than 6.7, say S = 5, then we have to go to the cubic splines. Now
the solution of F(λ) − S = 0 gives λ = 0.0274 for S = 5. From (8.100) we get

a = [2.199, 2.397, 2.415, 2.180, 1.808]ᵀ,   b = [0.198, 0.057, −0.194, −0.357, 0.000]ᵀ,
c = [0.000, −0.033, −0.049, −0.022, 0.000]ᵀ,   d = [0.000, −0.005, 0.009, 0.007, 0.000]ᵀ
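The straight-line stage of this example is easy to verify numerically. The following Python sketch (my own illustration for the data of Example 8.6; the variable names are assumptions, not the book's notation) reproduces μ_x, μ_y, b, a, the least squares error, and the confidence interval for S:

import numpy as np

# Data of Example 8.6: y observed at x = 0, 1, ..., N with spacing h = 1
y = np.array([1.0, 3.0, 4.0, 2.0, 1.0])
x = np.arange(len(y))                  # x0 = 0, ..., xN = 4, so N = 4
N = len(y) - 1

# Sample averages used in (8.98)
mu_x, mu_y = x.mean(), y.mean()
mu_xy, mu_xx = (x * y).mean(), (x * x).mean()

# Straight-line fit g(x) = a + b x
b = (mu_xy - mu_x * mu_y) / (mu_xx - mu_x ** 2)   # -0.1
a = mu_y - b * mu_x                               # 2.4
g = a + b * x

F = np.sum((g - y) ** 2)                          # 6.7, must be <= S for a line fit
S_lo = (N + 1) - np.sqrt(2 * (N + 1))             # 1.84, confidence interval of S
S_hi = (N + 1) + np.sqrt(2 * (N + 1))             # 8.16
print(b, a, F, S_lo, S_hi)

Since F = 6.7 lies inside [1.84, 8.16], the line fit is acceptable unless a smaller S is imposed, as in the cubic spline case above.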
In the previous section we saw that the smoothing splines solve a least squares
minimization problem. Many problems in linear estimation theory can also be
reduced to least squares minimization problems. In this section we consider such
problems in the context of image restoration.
Consider the spatially invariant image observation model of (8.39). The constrained
least squares restoration filter output û(m, n), which is an estimate of u(m, n),
minimizes a quantity

J ≜ ||q(m, n) ⊛ û(m, n)||²        (8.104)

where q(m, n) is an operator that measures the "roughness" of û(m, n). For example,
if q(m, n) is the impulse response of a high-pass filter, then minimization of J
implies smoothing of high frequencies or rough edges. Using the Parseval relation,
this implies minimization of

J = ||Q(ω1, ω2) Û(ω1, ω2)||²        (8.107)

subject to

||V(ω1, ω2) − H(ω1, ω2) Û(ω1, ω2)||² ≤ ε²        (8.108)
The solution obtained via the Lagrange multiplier method gives

Û(ω1, ω2) = G_ls(ω1, ω2) V(ω1, ω2)        (8.109)

G_ls ≜ H*(ω1, ω2) / [|H(ω1, ω2)|² + γ|Q(ω1, ω2)|²]        (8.110)

The Lagrange multiplier γ is determined such that Û satisfies the equality in
(8.108) subject to (8.109) and (8.110). This yields a nonlinear equation for γ:

f(γ) = (1/4π²) ∫∫ γ²|V|²|Q|⁴ / (|H|² + γ|Q|²)² dω1 dω2 − ε² = 0        (8.111)

In least squares filtering, q(m, n) is commonly chosen as a finite difference approxi-
mation of the Laplacian operator ∂²/∂x² + ∂²/∂y². For example, on a square grid with
spacing Δx = Δy = 1 and α = 1/4, one obtains

q(m, n) ≜ −δ(m, n) + α[δ(m − 1, n) + δ(m + 1, n) + δ(m, n − 1) + δ(m, n + 1)]        (8.112)
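A direct frequency-domain realization of (8.109), (8.110), and (8.112) is straightforward. The sketch below is a minimal Python/numpy illustration, not the book's implementation; the fixed value of gamma and the array sizes are assumptions, since in practice γ must be adjusted until (8.111) is satisfied:

import numpy as np

def constrained_ls_restore(v, h, gamma, alpha=0.25):
    """Constrained least squares filter (8.109)-(8.110) with the
    Laplacian-type smoothness operator q of (8.112)."""
    M, N = v.shape
    # q(m, n) = -delta(m, n) + alpha * (4-neighbor deltas), placed for circular FFT use
    q = np.zeros((M, N))
    q[0, 0] = -1.0
    q[1, 0] = q[-1, 0] = q[0, 1] = q[0, -1] = alpha
    H = np.fft.fft2(h, s=(M, N))
    Q = np.fft.fft2(q)
    V = np.fft.fft2(v)
    G = np.conj(H) / (np.abs(H) ** 2 + gamma * np.abs(Q) ** 2)   # (8.110)
    return np.real(np.fft.ifft2(G * V))                          # (8.109)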
Remarks
In vector notation the constrained least squares filter is

û = (ℋᵀℋ + γ𝒬ᵀ𝒬)⁻¹ ℋᵀ𝓋,   γ²||[ℋ(𝒬ᵀ𝒬)⁻¹ℋᵀ + γI]⁻¹𝓋||² − ε² = 0        (8.115)
8.9 GENERALIZED INVERSE, SVD, AND ITERATIVE METHODS
The foregoing least squares and mean square restoration filters can also be realized
by direct minimization of their quadratic cost functionals, Such direct minimization
· techniques are most useful when little is known about the statistical properties of
the observed image data and when the PSF is spatially variant. Consider the image
observation model
v=Hu (8.117) .
The Pseudoinverse
In general, whenever the rank of H is r < N, (8.119) does not have a unique
solution. Additional constraints on u are then necessary to make it unique.
A vector u⁺ that has the minimum norm ||u||² among all the solutions of (8.119) is
called the MNLS (minimum norm least squares) solution. Thus

u⁺ = min_û {||û||²;  HᵀHû = Hᵀv}        (8.124)

Clearly, if rank[HᵀH] = N, then u⁺ = û is the least squares solution. Using the
singular value expansion of H, it can be shown that the transformation between v
and u⁺ is linear and unique and is given by

u⁺ = H⁺v        (8.125)
,
The matrix H⁺ is called the generalized inverse of H. If the M × N matrix H has the
SVD expansion [see (5.188)]

H = Σ_{m=1}^{r} λ_m^{1/2} ψ_m φ_mᵀ        (8.126a)

then

H⁺ = Σ_{m=1}^{r} λ_m^{−1/2} φ_m ψ_mᵀ        (8.126b)

where φ_m and ψ_m are, respectively, the eigenvectors of HᵀH and HHᵀ correspond-
ing to the singular values {λ_m, 1 ≤ m ≤ r}. Using (8.126b), it can be shown that H⁺
satisfies the following relations:

1. H⁺ = (HᵀH)⁻¹Hᵀ,  if r = N
2. H⁺ = Hᵀ(HHᵀ)⁻¹,  if r = M
3. HH⁺ = (HH⁺)ᵀ
4. H⁺H = (H⁺H)ᵀ
5. HH⁺H = H
6. H⁺HHᵀ = Hᵀ        (8.127)
This method is quite general and is applicable to arbitrary PSFs. The major
difficulty is computational because it requires calculation of ψ_k and φ_k for large
matrices. For example, for M = N = 256, H is 65,536 × 65,536. For images with
separable PSF, that is, 𝒱 = H1 𝒰 H2, the generalized inverse is also separable, giving
𝒰⁺ = H1⁺ 𝒱 H2⁺.
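In matrix form the MNLS solution can be computed directly from the SVD, exactly as in (8.125) and (8.126). A small numpy sketch of this (my illustration only; the rank tolerance tol is an assumption):

import numpy as np

def mnls_solution(H, v, tol=1e-10):
    """MNLS solution u+ = H^+ v via the SVD of H, following (8.126)."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    r = np.sum(s > tol * s[0])                 # numerical rank
    # H^+ = sum over the r nonzero singular values of (1/s_m) phi_m psi_m^T
    H_plus = (Vt[:r].T / s[:r]) @ U[:, :r].T
    return H_plus @ v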
When the ultimate aim is to obtain the restored solution u", rather than the explicit
•
pseudoinverse H⁺, then iterative gradient methods are useful. One-step gradient
algorithms are of the form

u_{n+1} = u_n − α_n g_n,   u_0 = 0        (8.128)
g_n ≜ −Hᵀ(v − Hu_n) = g_{n−1} − α_{n−1} HᵀH g_{n−1}        (8.129)

where u_n and g_n are, respectively, the trial solution and the gradient of J at iteration
step n and α_n is a scalar quantity. For the interval 0 < α_n < 2/λ_max(HᵀH), u_n con-
verges to the MNLS solution u⁺ as n → ∞. If α_n is chosen to be a constant, then its
optimum value for fastest convergence is given by [22]

α_opt = 2 / [λ_max(HᵀH) + λ_min(HᵀH)]        (8.130)
•
For a highly ill conditioned matrix HᵀH, the condition number λ_max/λ_min is large.
Then α_opt is close to its upper bound 2/λ_max, and the error at iteration n, for α_n = α,
obeys

e_n = (I − αHᵀH) e_{n−1}        (8.131)

This implies that ||e_n|| is proportional to ||e_{n−1}||; that is, the convergence is linear and
can be very slow, because α is bounded from above. To improve the speed of
convergence, α is optimized at each iteration, which yields the steepest descent
algorithm, with

α_n = (g_nᵀ g_n) / (g_nᵀ HᵀH g_n)        (8.132)
However, even this may not significantly help in speeding up the convergence when
the condition number of A is high.
From (8.128) and (8.129), the solution at iteration n = j, with α_n = α, can be written
as u_j = G_j v, where G_j is a power series:

G_j = α Σ_{k=0}^{j} (I − αHᵀH)^k Hᵀ        (8.133)
One advantage of this method is that the region of support of each filter stage
is only twice (in each dimension) that of the PSF. Therefore, if the PSF is not too
broad, each filter stage can be conveniently implemented by an FIR filter. This can
be attractive for real-time pipeline implementations.
The scalars α_n, β_n and vectors d_n can be calculated conveniently via the recursions

u_{n+1} = u_n + α_n d_n,   α_n = −(d_nᵀ g_n)/(d_nᵀ A d_n),   u_0 = 0
d_n = −g_n + β_{n−1} d_{n−1},   β_{n−1} = (g_nᵀ A d_{n−1})/(d_{n−1}ᵀ A d_{n−1}),   d_0 = −g_0        (8.136)
g_n = −Hᵀv + A u_n = g_{n−1} + α_{n−1} A d_{n−1},   g_0 ≜ −Hᵀv

where A ≜ HᵀH.
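For reference, the recursions of (8.136) translate into a few lines of code. The sketch below is my own generic conjugate gradient illustration for A = HᵀH and b = Hᵀv; the convergence test and the iteration cap are assumptions:

import numpy as np

def conjugate_gradient(H, v, n_iter=None, eps=1e-10):
    """Solve H^T H u = H^T v by the conjugate gradient recursions of (8.136)."""
    A = H.T @ H
    b = H.T @ v
    u = np.zeros(A.shape[0])
    g = -b                       # gradient at u0 = 0
    d = -g                       # initial direction d0 = -g0
    n_iter = n_iter or A.shape[0]
    for _ in range(n_iter):
        Ad = A @ d
        alpha = -(d @ g) / (d @ Ad)
        u = u + alpha * d
        g = g + alpha * Ad
        if np.linalg.norm(g) < eps:
            break
        beta = (g @ Ad) / (d @ Ad)
        d = -g + beta * d
    return u

With exact arithmetic the iteration terminates in at most N steps, which is what Example 8.7 below illustrates for a 2 × 2 system.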
(Figure: realization of the iterative restoration filter of (8.133) as a cascade of identical stages αH*(ω1, ω2) and 1 − α|H(ω1, ω2)|², operating on v(m, n).)
Example 8.7
Consider the solution of v = Hu, where A ≜ HᵀH has eigenvalues λ1 = 10 + √65 and
λ2 = 10 − √65. Since A is nonsingular, u⁺ = A⁻¹Hᵀv. This gives

u_1 = [0.8, 1.3]ᵀ,   J_1 = 9.46

The conjugate gradient algorithm converges in two steps, as expected, giving u_2 = u⁺.
Both conjugate gradient and steepest descent are much faster than the one-step
gradient algorithm.
•
. Separable Point Spread Functions
•
algorithms. For example, the matrix form of the conjugate gradient algorithm
becomes

U_{n+1} = U_n + α_n D_n,   α_n = −⟨G_n, D_n⟩ / ⟨D_n, A1 D_n A2⟩
D_n = −G_n + β_{n−1} D_{n−1},   β_{n−1} = ⟨G_n, A1 D_{n−1} A2⟩ / ⟨D_{n−1}, A1 D_{n−1} A2⟩        (8.137)
G_n = G_{n−1} + α_{n−1} A1 D_{n−1} A2,   D_0 = −G_0 = H1ᵀ V H2

where A1 ≜ H1ᵀH1, A2 ≜ H2ᵀH2 and ⟨X, Y⟩ ≜ Σ_m Σ_n x(m, n) y(m, n). This algorithm
has been found useful for restoration of images blurred by spatially variant PSFs
[24].
Recursive filters realize an infinite impulse response with finite memory and are
particularly useful for spatially varying restoration problems. In this section we con-
sider the Kalman filtering technique, which is of fundamental importance in linear
estimation theory.
Online filter. The online filter is the best estimate of the current state based
on all the observations received currently, that is,

x̂_n ≜ E[x_n | y_{n'}, 0 ≤ n' ≤ n]
x̂_n = s_n + R_n C_nᵀ q_n⁻¹ ν_n        (8.145)

This estimate simply updates the predicted value s_n by the new information ν_n with a
gain factor R_n C_nᵀ q_n⁻¹ to obtain the most current estimate. From (8.140), we now see
that

s_{n+1} = A_n x̂_n        (8.146)
(Figure 8.22 Kalman filter structures: (a) filter; (b) one-step predictor; (c) update/predict realization, in which the gain R_n Cᵀ q_n⁻¹ updates the prediction and the state model predicts the next column.)
Therefore, Kalman filtering can be thought of as having two steps (Fig. 8.22c). The
first step is to update the previous prediction by the innovations. The next step is to
predict from the latest update.
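The two-step structure is easy to see in code. The following minimal sketch is my own illustration of the generic predict/update cycle, not the book's image-specific filter; the model matrices A, C, Q, R are placeholders:

import numpy as np

def kalman_step(s, P, y, A, C, Q, R):
    """One update/predict cycle: s, P are the predicted state and covariance."""
    # Update: innovation nu = y - C s, gain K = P C^T (C P C^T + R)^(-1)
    nu = y - C @ s
    q = C @ P @ C.T + R
    K = P @ C.T @ np.linalg.inv(q)
    x_hat = s + K @ nu                          # filtered estimate, as in (8.145)
    P_hat = (np.eye(len(s)) - K @ C) @ P
    # Predict: propagate the updated estimate through the state model
    s_next = A @ x_hat
    P_next = A @ P_hat @ A.T + Q
    return x_hat, s_next, P_next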
where y(n), n = 1, 2, ... represents one scan line. The PSF h(n, k) is assumed to be
spatially varying here in order to demonstrate the power of recursive methods. Assume
that each scan line is represented by a qth-order AR model

u(n) = Σ_{k=1}^{q} a(k) u(n − k) + ε(n),   E[ε(n)ε(n′)] = β²δ(n − n′)        (8.150)

Without loss of generality let l1 + l2 ≥ q and define a state variable x_n = [u(n + l1), ...,
u(n + 1), u(n), ..., u(n − l2)]ᵀ, which yields an (l1 + l2 + 1)-order state variable system,
with e_n ≜ ε(n + l1 + 1). This formulation is now in the framework of Kalman filtering,
and the various recursive estimates can be obtained readily. Extension of these ideas to
two-dimensional blurs is also possible. For example, it has been shown that an image
blurred by a Gaussian PSF representing atmospheric turbulence can be restored by a
Kalman filter associated with a diffusion equation [66].
•
, (8.154)
, = UN + 1In
•
Y. (8.157)
Equations (8.155) and (8.157) are now in the proper state variable form to yield the
Kalman filter equations .
•
ŝ_{n+1} = L ĝ_n
R̃_n = [I − R_n(R_n + σ_ν² I)⁻¹] R_n = σ_ν² (R_n + σ_ν² I)⁻¹ R_n        (8.158)

Using the fact that L = L1⁻¹ L2, this gives the following recursions.

Observation update:
ĝ(m, n) = ŝ(m, n) + Σ_i k_n(m, i) ν_n(i)        (8.159)

Prediction:
ŝ(m, n + 1) = ρ1 ŝ(m − 1, n + 1) + ρ2 ĝ(m, n) − ρ3 ĝ(m − 1, n) + ρ4 ĝ(m + 1, n)        (8.160)
where k_n(m, i) are the elements of K_n ≜ R_n(R_n + σ_ν² I)⁻¹. The Riccati equation can
be implemented as L1 R_{n+1} L1ᵀ = R̄_n, where R̃_n = [I − K_n] R_n        (8.161)

This gives the following:

Forward recursion:
L1 R_{n+1} = Q_n  ⇒  r_{n+1}(i, j) = ρ1 r_{n+1}(i − 1, j) + q_n(i, j)        (8.162)

Backward recursion:
Q_n L1ᵀ = R̃_n  ⇒  q_n(i, j) = ρ1 q_n(i, j + 1) + r̃_n(i, j)        (8.163)
, '
This means that Kn> defined before, is the one-dimensional Wiener filter for the
noisy vector 11n and that the summation term in (8.159) represents the output of this
Wiener filter, that is, K" 11n = en> where en is the best mean square estimate of en
given V n • This gives
, gem, n) = sCm, n) + (m, n) (8.165) e
e
where (m, n) is obtained by processing the elements.of the Vn. In (8.159), the
estimate at the nth column is updated as Yn arrives. Equation (8.160) predicts
recursively the next column from this update (Fig. 8.23). Examining the preceding
Figure 8.23 Two-dimensional recursive filter. (The update step applies the one-dimensional Wiener filter gains k_n(m, i) to the innovations, and the predict step forms ŝ(m, n + 1) = ρ1 ŝ(m − 1, n + 1) + ρ2 û(m, n) − ρ3 û(m − 1, n) + ρ4 ĝ(m + 1, n).)
Stationary Models
•
For stationary models and large image sizes, R_n will be nearly Toeplitz, so that the
matrix operations in (8.158) can be approximated by convolutions, which can be
implemented via the FFT. This will reduce the number of operations to O(N log N)
per column, or O(log N) per pixel.
Steady-State Filter
. ' .
For smooth images, R, achieves steady state quite rapidly, and therefore K" in
, (8.159) may be replaced by its steady-state value K. Given K, the filter equations
need O(N) operations per pixel.
If the steady-state gain is used, then from the steady-state solution of the Riccati
equation, it may be possible to find a low-order, approximate state variable model
for e(m, n), such as

e(m, n) = C x_m        (8.166)

where the dimension of the state vector x_m is small compared to N. This means that
for each n, the covariance matrix of the sequence e(m, n), m = 1, ..., N, is approxi-
mately R = lim_{n→∞} R_n. The observation model for each fixed n is
In practice, the updated value ê(m, n) depends most strongly on the observations
[i.e., v(m, n)] in the vicinity of the pixel at (m, n). Therefore, the dimensionality of
the vector recursive filter can be reduced by constraining ê(m, n) to be the output of
a one-dimensional FIR Wiener filter of the form

where r(m − k) are the elements of R, the Toeplitz covariance matrix of e_n used
previously. Substituting (8.165) in (8.160), the reduced update recursive filter be-
comes

ŝ(m, n + 1) = ρ1 ŝ(m − 1, n + 1) − ρ3 ŝ(m − 1, n) + ρ2 ŝ(m, n) + ···        (8.170)

where ê(m, n) is given by (8.168). A variant of this method has been considered in
[36].
Remarks
The recursive filters just considered are useful only when a causal stochastic model
such as (8.151) is available. If we start with a given covariance model, then the FIR
Wiener filter discussed earlier is more practical.

Semicausal models represent each image column in terms of neighboring column
sequences. Such models yield semirecursive filtering algorithms, where each image
column is first unitarily transformed and each transform coefficient is then passed
through a recursive filter (Fig. 8.24). The overall filter is a combination of fast
transform and recursive algorithms. We start by writing the observed image as

v(m, n) = Σ_{k=−p1}^{p2} Σ_{l=−q1}^{q2} h(k, l) u(m − k, n − l) + η(m, n),   1 ≤ m ≤ N,  n = 0, 1, 2, ...        (8.171)
•
(Figure 8.24 Semirecursive filtering: each observed column v_n is transformed, each transform coefficient v_n(k) is passed through its own recursive filter to give x̂_n(k), and the inverse transform yields û_n.)
v_n = Σ_l H_l u_{n−l} + b_n + η_n        (8.172)

where v_n, u_n, b_n, and η_n are N × 1 vectors and b_n depends only on u(−p2 + 1, n), ...,
u(0, n), u(N + 1, n), ..., u(N + p1, n), which are boundary elements of the nth column
of u(m, n). The H_l are banded Toeplitz matrices.
Filter Formulation

Let W be a fast unitary transform such that W H_l W*ᵀ, for every l, is nearly
diagonal. From Chapter 5 we know that many sinusoidal transforms tend to diago-
nalize Toeplitz matrices. Therefore, defining

y_n ≜ W v_n,   x_n ≜ W u_n,   c_n ≜ W b_n,   ν_n ≜ W η_n
In most situations the image background is known or can be estimated quite accu-
rately. Hence c_n(k) can be assumed to be known and can be absorbed in y_n(k) to
give the observation system for each row of the transformed vectors as

y_n(k) = Σ_{l=−q1}^{q2} γ_l(k) x_{n−l}(k) + ν_n(k),   k = 1, ..., N        (8.175)

x_n(k) = Σ_l a_l(k) x_{n−l}(k) + e_n(k),   k = 1, ..., N        (8.176)

which together with (8.175) can be set up in the framework of Kalman filtering, as
shown in Example 8.8. Alternatively, each line {y_n(k), n = 0, 1, ...} can be pro-
cessed by its one-dimensional Wiener filter. This method has been found useful in
adaptive filtering of noisy images (Fig. 8.25) using the cosine transform. The entire
image is divided into small blocks of size N × N (typically N = 16 or 32). For each k,
the spectral density S_y(ω, k) of the sequence {y_n(k), n = 0, 1, ..., N − 1} is esti-
mated by a one-dimensional spectral estimation technique, which assumes {y_n} to
be an AR sequence [8]. Given S_y(ω, k) and σ_ν², the noise power, the sequence y_n(k)
is Wiener filtered to give x̂_n(k), where the filter frequency response is given by

G(ω, k) ≜ S_x(ω, k) / [S_x(ω, k) + σ_ν²] = [S_y(ω, k) − σ_ν²] / S_y(ω, k)        (8.177)

In practice, S_x(ω, k) is set to zero if the estimated S_y(ω, k) is less than σ_ν². Figures
8.17 and 8.18 show examples of this method, where it is called the COSAR (cosine-
AR) algorithm.
(Figure 8.25 COSAR algorithm for adaptive filtering using semicausal models: cosine transform, one-dimensional Wiener filter per coefficient, inverse cosine transform.)
Speckle Representation

In free space, speckle can be considered as an infinite sum of independent, identical
phasors with random amplitude and phase [41, 42]. This yields a representation of
its complex amplitude as

a(x, y) = a_R(x, y) + j a_I(x, y)        (8.178)

where a_R and a_I are zero mean, independent Gaussian random variables (for each
x, y) with variance σ_a². The intensity field is simply

s = s(x, y) = |a(x, y)|² = a_R² + a_I²        (8.179)

which has the exponential distribution of (8.17) with σ² ≜ 2σ_a² and mean
μ_s = E[s] = σ². A white noise field with these statistics is called the fully developed
speckle.
For any speckle, the contrast ratio is defined as

γ = (standard deviation of s) / (mean value of s)        (8.180)
where η(x, y) is the additive detector noise and φ(x, y) represents the phase dis-
tortion due to scattering. If the impulse response decays rapidly outside a region
R_cell(x, y), called the resolution cell, and g(x, y) is nearly constant in this region,
then [44]

v(x, y) ≈ |g(x, y)|² |a(x, y)|² + η(x, y) = u(x, y) s(x, y) + η(x, y)        (8.182)

where s_N(x, y) is the N-look average of the speckle fields. This is also the maximum
likelihood estimate of [v_l(x, y), l = 1, ..., N], which yields

E[v̄_N] = μ_s u,   var[v̄_N] = σ² u² / N        (8.186)

This gives the contrast ratio γ = 1/√N for v̄_N. Therefore, the contrast improves by a
factor of √N for N-look averaging.
• •
w_N(x, y) ≜ z(x, y) + η_N(x, y)        (8.188)
,f
, ,,' .
• .-
-
-
•
- ," d
-~- .
, ---
• •
" ...
•
.-
•
-- ,,- , -
-•
For N ≥ 2, η_N can be modeled reasonably well by a Gaussian random field
[45], whose spectral density function is given by

S_η(ξ1, ξ2) = σ_η² = { π²/6,  N = 1
                      { 1/N,   N ≥ 2        (8.189)
Now z(x, y) can be easily estimated from w_N(x, y) using Wiener filtering tech-
niques. This gives the overall filter algorithm of Fig. 8.27, which is also called the
homomorphic filter. Experimental studies have shown that the homomorphic
Wiener filter performs quite well compared to linear filtering or other homo-
morphic linear filters [46]. Figure 8.27 shows the performance of an adaptive FIR
Wiener filter used in the homomorphic mode.
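A minimal homomorphic smoothing sketch along these lines is given below. It is my own illustration of the log/filter/exp structure described in the text: the inner smoother here is a plain local-averaging stand-in for the Wiener filter, and the small offset eps is an assumption to avoid taking the log of zero:

import numpy as np
from scipy.ndimage import uniform_filter

def homomorphic_despeckle(v, size=5, eps=1e-6):
    """Log-transform, smooth, exponentiate: the homomorphic filter structure."""
    w = np.log(v + eps)              # w = log(u*s) = log u + log s, as in (8.188)
    z_hat = uniform_filter(w, size)  # stand-in for the Wiener filter acting on w
    return np.exp(z_hat)             # back to the intensity domain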
8.14 MAXIMUM ENTROPY RESTORATION
-
The inputs, outputs, and the PSFs of incoherent imaging systems (the usual case)
are nonnegative. The least squares or mean square criteria based restoration algo-
rithms do not yield images with nonnegative pixel values. A restoration method
based on the maximum entropy criterion gives nonnegative solutions. Since entropy
is a measure of uncertainty, the general argument behind this criterion is that it
assumes the least about the solution and gives it the maximum freedom within the
limits imposed by constraints.
Given the observation model

v = ℋu + η        (8.190)

where ℋ is the PSF matrix and u and v are the object and observation arrays
mapped into vectors, a maximum entropy restoration problem is to maximize

𝒢(u) ≜ −Σ_n u(n) log u(n)        (8.191)

subject to the constraint

||v − ℋu||² = σ_m²        (8.192)

where σ_m² > 0 is a specified quantity. Because u(n) is nonnegative and can be
normalized to give Σ_n u(n) = 1, it can be treated as a probability distribution whose
entropy is 𝒢(u). Using the usual Lagrangian method of optimization, the solution
u is given by the implicit equation

u = exp{−1 − λℋᵀ(v − ℋu)}        (8.193)

where exp{x} denotes a vector of elements exp{x(k)}, k = 0, 1, ..., 1 is a vector of
all 1s, and λ is a scalar Lagrange multiplier such that u satisfies the constraint of
(8.192). Interestingly, a Taylor series expansion of the exponent, truncated to the
first two terms, yields the constrained least squares solution

u = (ℋᵀℋ + λI)⁻¹ ℋᵀv        (8.194)

Note that the solution of (8.193) is guaranteed to be nonnegative. Experimental
results show that this method gives sharper restorations than the least squares filters
when the image contains a small number of point objects (such as in astronomy
images) [48].
A stronger restoration result is obtained by maximizing the entropy defined by
(8.191) subject to the constraints

u(n) ≥ 0,   n = 0, ..., N − 1
Σ_{j=0}^{N−1} h(m, j) u(j) = v(m),   m = 0, ..., M − 1        (8.195)

The solution has the form

u(n) = (1/e) exp{ Σ_{l=0}^{M−1} h(l, n) λ(l) },   n = 0, ..., N − 1        (8.196)

where λ(l) are Lagrange multipliers (also called dual variables) that maximize the
functional

J(λ) ≜ Σ_{n=0}^{N−1} u(n) − Σ_{l=0}^{M−1} λ(l) v(l)        (8.197)

The above problem is now unconstrained in λ(l) and can be solved by invoking
several different algorithms from optimization theory. One example is a coordinate
ascent algorithm, where constraints are enforced one by one in cyclic iterations,
giving [49]

u_j(n) = u_{j−1}(n) [ v(m) / Σ_{k=0}^{N−1} h(m, k) u_{j−1}(k) ]^{h(m, n)}        (8.198)

where m = j modulo M, n = 0, ..., N − 1 and j = 0, 1, .... At the jth iteration, u_j(n)
is updated for all n and a fixed m. After m = M − 1, the iterations continue cyclically,
updating the constraints. Convergence to the true solution is often slow but is
assured as j → ∞, for 0 ≤ h(m, n) ≤ 1. Since the PSF is nonnegative, this condition is
easily satisfied by scaling the observations appropriately.
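A cyclic, row-by-row multiplicative update of this kind is simple to implement. The sketch below is my own illustration of such a coordinate-ascent scheme (a MART-style update), not a verbatim transcription of (8.198); it assumes positive observations and a PSF scaled so that 0 ≤ h(m, n) ≤ 1, and the starting estimate is an assumption:

import numpy as np

def max_entropy_restore(H, v, n_sweeps=50):
    """Cyclic multiplicative constraint enforcement (MART-style coordinate ascent)."""
    M, N = H.shape
    u = np.full(N, v.mean() / max(H.sum(axis=1).mean(), 1e-12))  # positive start
    for j in range(n_sweeps * M):
        m = j % M                       # constraint enforced at this iteration
        pred = H[m] @ u
        if pred <= 0:
            continue
        u *= (v[m] / pred) ** H[m]      # multiplicative update keeps u > 0
    return u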
• •
Log-Entropy Restoration
Log-entropy restoration maximizes Σ_n log u(n) subject to the constraints of (8.195).
The solution now is obtained by solving the nonlinear equations
•
(8.201)
Once again an iterative gradient or any other suitable method may be chosen to
maximize (8.201). A coordinate ascent method similar to (8.198) yields the iterative
solution [50]

u_{j+1}(n) = u_j(n) / [1 + α_j h(m, n) u_j(n)],   m = j modulo M,  n = 0, 1, ..., N − 1        (8.202)

where α_j is determined such that the denominator term is positive and the constraint

Σ_{n=0}^{N−1} h(m, n) u_{j+1}(n) = v(m)        (8.203)

is satisfied at each iteration. This means we must solve for the positive root of the
nonlinear equation. As before, the convergence, although slow, is assured as j → ∞.
For h(m, n) > 0, which is true for PSFs, this algorithm guarantees a positive estimate
at any iteration step. The speed of convergence can be improved by going to the
gradient algorithm [51]

λ_{j+1}(m) = λ_j(m) + α_j g_j(m),   j = 0, 1, ...        (8.205)

g_j(m) = v(m) − Σ_{n=0}^{N−1} h(m, n) u_j(n)        (8.206)

u_j(n) = 1 / [ Σ_{m=0}^{M−1} h(m, n) λ_j(m) ]        (8.207)

where λ_0(m) are chosen so that u_0(n) is positive and α_j is a positive root of the
equation

f(α_j) ≜ Σ_{k=0}^{N−1} G_j(k) [A_j(k) + α_j G_j(k)]⁻¹ = 0        (8.208)

where

G_j(k) ≜ Σ_{m=0}^{M−1} h(m, k) g_j(m),   A_j(k) = Σ_{m=0}^{M−1} h(m, k) λ_j(m)        (8.209)

The search for α_j can be restricted to the interval [0, max_k{G_j(k)/A_j(k)}].
This maximum entropy problem appears often in the theory of spectral esti-
mation (see Problem 8.26b). The foregoing algorithms are valid in multidimensions
if u(n) and v(m) are sequences obtained by suitable ordering of elements of the
multidimensional arrays u(i, j, ...) and v(i, j, ...), respectively.
8.15 BAYESIAN METHODS

In many imaging situations, for instance image recording by film, the observation
model is nonlinear, of the form

v = f(ℋu) + η        (8.210)

where f(x) is a nonlinear function of x. The a posteriori conditional density is given by
Bayes' rule

p(u|v) = p(v|u) p(u) / p(v)        (8.211)

The MAP estimate u_MAP satisfies

u_MAP = μ_u + ℛ_u ℋᵀ 𝒟 ℛ_η⁻¹ [v − f(ℋu_MAP)]        (8.214)

where μ_u is the mean of u and 𝒟 is defined in (8.213) but now with w ≜ ℋu_MAP.
Since these equations are nonlinear, an alternative is to maximize the appro-
priate log densities. For example, a gradient algorithm for u_MAP is

u_{j+1} = u_j − α_j {ℋᵀ𝒟_j ℛ_η⁻¹ [v − f(ℋu_j)] − ℛ_u⁻¹ [u_j − μ_u]}        (8.215)

where α_j > 0, and 𝒟_j is evaluated at w_j ≜ ℋu_j.
Remarks

If the function f(x) is linear, say f(x) = x, and ℛ_η = σ_η² I, then u_ML reduces to the
least squares solution

ℋᵀℋ u_ML = ℋᵀv        (8.216)

and the MAP estimate reduces to the Wiener filter output for zero mean noise [see
(8.87)],

u_MAP = μ_u + 𝒢(v − μ_v)        (8.217)

where 𝒢 ≜ (ℛ_u⁻¹ + ℋᵀℛ_η⁻¹ℋ)⁻¹ ℋᵀℛ_η⁻¹.
In practice, μ_v may be estimated as a local average of v and μ_u ≈ ℋ⁻ f⁻¹(μ_v),
where ℋ⁻ is the generalized inverse of ℋ.
•
•
8.16 COORDINATE TRANSFORMATION AND GEOMETRIC CORRECTION

transformation between the two coordinate systems. Common examples of geo-
metric transformations are translation, scaling, rotation, skew, and reflection, all of
which can be represented by the affine transformation

(x′, y′)ᵀ = A (x, y)ᵀ + b        (8.218)

In principle, the image function in (x′, y′) coordinates can be obtained from its
values on the (x_i, y_i) grid by an appropriate interpolation method followed by
resampling on the desired grid. Some commonly used algorithms for interpolation
at a point Q (Fig. 8.28) from samples at P1, P2, P3, and P4 are as follows.
1. Nearest neighbor:
For many imaging systems the PSF is spatially varying in Cartesian coordi-
nates but becomes spatially invariant in a different coordinate system, for example,
in systems with spherical aberrations, coma, astigmatism, and the like [56, 57].
These and certain other distortions (such as that due to rotational motion) may be
restored via an appropriate coordinate transformation.

where n ≥ 1 and (r, θ) are the polar coordinates. The PSF is spatially varying in
(x, y). In (r, θ) it is shift invariant in θ but spatially variant in r. Under the loga-
rithmic transformations

ξ = ln r,   ξ0 = n ln r0        (8.223)

the ratio r/r0 becomes a function of the displacement ξ − ξ0 and (8.222) can be
written as a convolution integral.        (8.224)

Spatially invariant filters can now be designed to restore f(ξ, θ) from g(ξ, θ).
Generalizations of this idea to other types of blurred images may be found in [56, 57].

The comet Halley shown on the front cover of this text was reconstructed
from data gathered by NASA's Pioneer Venus Orbiter in 1986. The observed data
was severely distorted, with several samples missing due to the activity of solar
flares. The restored image was obtained by proper coordinate transformation,
bilinear interpolation, and pseudocoloring.
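Geometric correction of this kind amounts to mapping each output grid point back through the coordinate transformation and interpolating. The sketch below is my own illustration, not the processing actually used for the cover image; the function names and the inverse-mapping loop are assumptions, and bilinear interpolation follows the interpolation discussion above:

import numpy as np

def bilinear(img, x, y):
    """Bilinear interpolation of img at real-valued coordinates (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, img.shape[1] - 1), min(y0 + 1, img.shape[0] - 1)
    a, b = x - x0, y - y0
    return ((1 - a) * (1 - b) * img[y0, x0] + a * (1 - b) * img[y0, x1]
            + (1 - a) * b * img[y1, x0] + a * b * img[y1, x1])

def affine_warp(img, A, t):
    """Resample img on the output grid using the inverse affine map of (8.218)."""
    out = np.zeros_like(img, dtype=float)
    Ainv = np.linalg.inv(A)
    for m in range(img.shape[0]):
        for n in range(img.shape[1]):
            x, y = Ainv @ (np.array([n, m]) - t)     # map back to input coords
            if 0 <= x < img.shape[1] - 1 and 0 <= y < img.shape[0] - 1:
                out[m, n] = bilinear(img, x, y)
    return out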
•
•
where V_k and U_k, k = 1, ..., M are obtained by dividing the images v(m, n) and u(m, n)
into M blocks and then Fourier transforming them. Therefore, identification of H
requires power spectrum estimation of the object and the observations. Restoration
methods that are based on unknown H are called blind deconvolution methods.
Note that this method gives only the magnitude of H. In many imaging situations
the phase of H is zero or unimportant, such as when H represents average atmos-
pheric turbulence, camera misfocus (or lens aberration), or uniform motion (linear
phase or delay). In such cases it is sufficient to estimate the MTF, which can then be
used in the Wiener filter equation. Techniques that also identify the phase are
possible in special situations, but, in general, phase estimation is a difficult task.
Analytic Continuation
A bandlimited signal f(x) can be determined completely from the knowledge of it
over an arbitrary finite interval [−α, α]. This follows from the fact that a band-
limited function is an analytic function, because its Taylor series

f(x + Δ) = Σ_{k=0}^{∞} (Δ^k / k!) f^{(k)}(x)        (8.227)

is convergent for all x and Δ. By letting x ∈ [−α, α] and x + Δ > α, (8.227) can be
used to extrapolate f(x) anywhere outside the interval [−α, α].
•
Super-resolution

The foregoing ideas can also be applied to a space-limited function (i.e., f(x) = 0 for
|x| > α) whose Fourier transform is given over a finite frequency band. This means,
theoretically, that a finite object imaged by a diffraction-limited system can be
perfectly resolved by extrapolation in the Fourier domain. Extrapolation of the
spectrum of an object beyond the diffraction limit of the imaging system is called
super-resolution.
The high-order derivatives in (8.227) are extremely sensitive to noise and truncation
errors. This makes the analytic continuation method impractical for signal extrapo-
lation. An alternative is to evaluate f(x) by the series expansion
∫ |f(x) − g(x)|² dx = ∫ |F(ξ) − G_0(ξ)|² dξ
                    = ∫_{|ξ|≤ξ0} |F(ξ) − F_1(ξ)|² dξ + ∫_{|ξ|>ξ0} |G_0(ξ)|² dξ
                    ≥ ∫ |F(ξ) − F_1(ξ)|² dξ = ∫ |f(x) − f_1(x)|² dx        (8.230)

where ξ0 denotes the bandlimit. Now f_1(x) is bandlimited but does not match the
observations over [−α, α]. The error energy is reduced once again if f_1(x) is
substituted by f(x) over −α ≤ x ≤ α. Letting 𝒥 denote this space-limiting operation,
we obtain

g_1(x) ≜ f_1(x) − 𝒥f_1(x) + g_0(x)        (8.231)

and

∫ |f(x) − g_1(x)|² dx = ∫_{|x|>α} |f(x) − f_1(x)|² dx ≤ ∫ |f(x) − f_1(x)|² dx        (8.232)
Now g_1(x), not being bandlimited anymore, is low-pass filtered, and the preceding
procedure is repeated. This gives the iterative algorithm

f_n(x) = ℬ g_{n−1}(x),   g_0(x) = g(x) ≜ 𝒥f(x)        (8.233)
g_n(x) = g_0(x) + (ℐ − 𝒥) f_n(x),   n = 1, 2, ...

where ℐ is the identity operator and ℬ is the bandlimiting operator. In the limit as
n → ∞, both f_n(x) and g_n(x) converge to f(x) in the mean square sense [62]. It can be
shown that this algorithm is a special case of a gradient algorithm associated with a
least squares minimization problem [65]. This algorithm is also called the method of
alternating projections because the iterates are projected alternately onto the space of
bandlimited and space-limited functions. Such algorithms are useful for solving
image restoration problems that include a certain class of constraints [53, 63, 64].
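The iteration (8.233) is only a few lines in discrete form. The sketch below is my own discrete illustration of this alternating-projection scheme: ideal low-pass filtering is done with an FFT mask, and the bandwidth and observation window are assumed parameters. Here the observation mask plays the role of 𝒥 and bandlimit plays the role of ℬ:

import numpy as np

def bandlimit(x, wc):
    """Ideal low-pass (bandlimiting) operator: keep |omega| <= wc (rad/sample)."""
    X = np.fft.fft(x)
    w = np.fft.fftfreq(len(x)) * 2 * np.pi
    X[np.abs(w) > wc] = 0.0
    return np.real(np.fft.ifft(X))

def extrapolate(z, obs_mask, wc, n_iter=200):
    """Energy-reduction / alternating-projection extrapolation, as in (8.233)."""
    g = np.where(obs_mask, z, 0.0)          # g0 = observed segment, zero elsewhere
    f = np.zeros_like(g)
    for _ in range(n_iter):
        f = bandlimit(g, wc)                # f_n = B g_{n-1}
        g = np.where(obs_mask, z, f)        # g_n = observations inside, f_n outside
    return f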
•
z(m) = y(m),   −M ≤ m ≤ M        (8.238)

Given z(m), extrapolate y(m) outside the interval [−M, M].

Let z denote the (2M + 1) × 1 vector of observations and let y denote the infinite
vector of {y(n), ∀n}. Then z = Sy. Since y(n) is a bandlimited sequence, Ly = y, and
we can write

z = SLy = Ay,   A ≜ SL        (8.239)

This can be viewed as an underdetermined image restoration problem, where A
represents a (2M + 1) × ∞ PSF matrix. A unique solution that is bandlimited and
consistent with the observations is the MNLS extrapolation

y⁺(m) = Σ_{j=−M}^{M} [sin ω1(m − j) / π(m − j)] x_j,   |m| > M        (8.242)

This means the MNLS extrapolator is a time-varying FIR filter {L⁻¹} followed by
a zero padder (Sᵀ) and an ideal low-pass filter (L) (Fig. 8.31).
•
Iterative Algorithms

Although L is positive definite, it becomes increasingly ill-conditioned as M in-
creases. In such instances, iterative algorithms that give a stabilized inverse of L are
useful [65]. An example is the conjugate gradient algorithm obtained by substi-
tuting A = L and g_0 = −z into (8.136). At n = 2M + 1, let ẑ ≜ u_n. Then ŷ ≜ LSᵀẑ
converges to y⁺. Whenever L is ill-conditioned, the algorithm is terminated when α_n
becomes small for n < 2M + 1. Compared to the energy reduction algorithm, the
iterations here are performed on finite-size vectors, and only a finite number of
iterations are required for convergence.
Similar to the PSWF expansion in the continuous case, it is possible to obtain the
MNLS extrapolation via the expansion

y⁺(m) = Σ_{k=1}^{2M+1} a_k φ_k(m),   ∀m        (8.243)

(Figure 8.31 MNLS extrapolation: the observations z(m), −M ≤ m ≤ M, are passed through the time-varying FIR filter {L⁻¹}, zero padded, and ideally low-pass filtered to give y⁺(m).)
ŷ = LSᵀ[SLSᵀ + σ_n² I]⁻¹ z        (8.248)

If σ_n² → 0, then ŷ → y⁺, the MNLS extrapolation. A recursive Kalman filter imple-
mentation of (8.248) is also possible [65].
Example 8.9

Figure 8.32a shows the signal y(m) = sin(0.0792πm) + sin(0.068πm), which is given
for −8 ≤ m ≤ 8 (Fig. 8.32b) and is assumed to have a bandwidth of less than ω1 = 0.1π.
Figures 8.32c and 8.32d show the extrapolations obtained via the iterative energy
reduction and conjugate gradient algorithms. As expected, the latter algorithm has
superior convergence. When the observations contain noise (13 dB below the signal
power), these algorithms tend to be unstable (Fig. 8.32e), but the mean square extrapo-
lation filter (Fig. 8.32f) improves the result. Comparison of Figures 8.32d and 8.32f
shows that the extrapolated region can be severely limited due to noise.
(Figure 8.32 Signal extrapolation: (a) actual signal; (b) observations; (c), (d) extrapolated signal via the energy reduction and conjugate gradient algorithms; (e) extrapolation in the presence of noise (13 dB below the signal) using the extrapolation matrix; (f) stabilized extrapolation in the presence of noise via the mean square extrapolation filter.)
•
inversion. The two-dimensional DPSS are given by the Kronecker product φ_k ⊗ φ_l.
8.19 SUMMARY
In this chapter we have considered linear and nonlinear image restoration tech-
niques. Among the linear restoration filters, we have considered the Wiener filter
and have shown that other filters such as the pseudoinverse, constrained least
squares, and smoothing splines also belong to the class of Wiener filters. For linear
observation models, if the PSF is not very broad, then the FIR Wiener filter is quite
efficient and can be adapted to handle spatially varying PSFs. Otherwise, non-
recursive implementations via the FFT (or other fast transforms) should be suitable.
Iterative methods are most useful for the more general spatially varying filters.
Recursive filters and semirecursive filters offer alternate realizations of the Wiener
filter as well as other local estimators.
; I',
PROBLEMS
, '
8.1 (Coherent image formation) According to the Fresnel theory of diffraction, the complex
amplitude field of an object u(x′, y′), illuminated by a uniform monochromatic source
of light at wavelength λ at a distance z, is given by

v(x, y) = c1 ∫∫_{−∞}^{∞} u(x′, y′) exp{ (jk/2z)[(x − x′)² + (y − y′)²] } dx′ dy′

where c1 is a complex quantity with |c1| = 1, k = 2π/λ. Show that if z is much greater
than the size of the object, then v(x, y) is a coherent image whose intensity is propor-
tional to the magnitude squared of the Fourier transform of the object. This is also
called the Fraunhofer diffraction pattern of the object.
8.2 (Optical filtering) An object u(x, y) illuminated by coherent light of wavelength λ
and imaged by a lens can be modeled by the spatially invariant system shown in Fig.
P8.2, where

h_k(x, y) = (1/jλd_k) exp{ (jπ/λd_k)(x² + y²) },   k = 0, 1, 2
(Figure P8.2 Optical filtering system: u(x, y) → h0(x, y) → multiplication → h1(x, y) → v(x, y), with distances d0 and d1.)
(Figure: observation model for a film-recording problem, with u(m, n) and noise η(m, n) passing through a perfect recording system followed by an inverse filter.)
8.6 Prove the first two statements made under Remarks in Section 8.3.
8.7 A motion blurred image (see Example 8.1) is observed in the presence of additive
white noise. What is the Wiener filter equation if the covariance function of u(x, y) is
r(x, y) = σ² exp{−0.05|x| − 0.05|y|}? Assume the mean of the object is known. Give an
algorithm for digital implementation of this filter.
8.8 Show exactly how (8.53) may be implemented using the FFT.
8.9 Starting from (8.57) show all the steps which lead to the FIR Wiener filter equation
(8.64).
8.10 (Spatially varying FIR filters) In order to derive the formulas (8.68), (8.69) for the
spatially varying FIR filter, note that the random field

results of Section 8.4, assuming Σ Σ h(m, n) = 1 and using the fact that û(m, n) should be
unbiased, prove (8.68) and (8.69) under the assumption
b. If H is the Wiener filter, then show the minimum value of σ_e² is given by (8.82).
8.14 (Sine/cosine transform-based Wiener filtering) Consider the white noise-driven
model for N × N images

u(m, n) = α[u(m − 1, n) + u(m + 1, n) + u(m, n − 1) + u(m, n + 1)] + ε(m, n),
|α| < 1/4,  0 ≤ m, n ≤ N − 1

S_ε(z1, z2) = β²

where u(−1, n) = u(0, n), u(N, n) = u(N − 1, n), u(m, −1) = u(m, 0), u(m, N) = u(m, N − 1).
a. Show that the cosine transform is the KL transform of u(m, n), which yields the
generalized Wiener filter gain for the noise smoothing problem as

p(k, l) = β² / { β² + σ_n²[1 − 2α(cos(πk/N) + cos(πl/N))]² },   0 ≤ k, l ≤ N − 1

b. If u(−1, n) = u(N, n) = u(m, −1) = u(m, N) = 0, show that the sine transform is
the KL transform and find the generalized filter gain.
8.15 a. Show that the spline coefficient vector a can be obtained directly as

a = (I + (σ_n²/λ) L Q⁻¹ Lᵀ)⁻¹ y

which is a Wiener filter if the noise in (8.96) is assumed to be white with zero mean
and variance σ_n² and if the autocorrelation matrix of y is λ[L Q⁻¹ Lᵀ]⁻¹.
b. The interpolating splines can be obtained by setting S = 0 or, equivalently, letting
λ → ∞ in (8.100). For the data of Example 8.6 find these splines.
8.16 Prove that (8.110) and (8.111) give the solution of the constrained least squares
restoration problem stated in the text.
8.17 Suppose the object u(m, n) is modeled as the output of a linear system driven by a zero
mean unit variance white noise random field ε(m, n), namely,

q(m, n) ⊛ u(m, n) = ε(m, n)

If u(m, n) is observed via (8.39) with S_ηη = γ, show that its Wiener filter is identical to
the least squares filter of (8.110). Write down the filter equation when q(m, n) is given
by (8.112) and show that the object model is a white noise-driven noncausal model.
8.18 Show that (8.115) is the solution of the least squares problem defined in (8.114).
8.19 If the sequences u(m, n), h(m, n), q(m, n) are periodic over an N × N grid with DFTs
U(k, l), H(k, l), Q(k, l), ..., then show the impulse response of the constrained least
squares filter is given by the inverse DFT of
8.20 Show that ℋ⁻ defined in (8.86) is the pseudoinverse of ℋ and is nonunique when
N1N2 ≤ M1M2.
8.21 Show that for separable PSFs, for which the observation equation can be written
as 𝓋 = (H1 ⊗ H2)𝓊, (8.136) yields the two-dimensional conjugate gradient algorithm
of (8.137).
8.22 Write the Kalman filtering equations for a first-order AR sequence u(n), which is
observed as

y(n) = u(n) + ½[u(n − 1) + u(n + 1)] + η(n)

8.23 (K-step interpolator) In many noise smoothing applications it is of interest to obtain
the estimate which lags the observations by K steps; that is, x̂_n ≜ E[x_n | y_{n'},
0 ≤ n' ≤ n + K]. Give a recursive algorithm for obtaining this estimate. In image
processing applications, often the one-step interpolator performs quite close to the
optimum smoother. Show that it is given by
• R ArCr
X n . l =: n
-, n n
+ DCr -'[
+ t qn + 1 V n GrCr
+1 -,
-&'« 11:
]+ qn V It - n n + 1 qn + 1 lin + 1 Sn
8.24 In the semicausal model of (6.106) assume u(0, n), u(N + 1, n) are known (that is,
image background is given) and the observation model of (8.171) has no blur, that is,
h(k, l) = δ(k, l). Give the complete semirecursive filtering algorithm and identify the
fast transform used.
8.25 Show that there is no speckle in images obtained by an ideal imaging system. Show
that, for a practical imaging system, the speckle size measured by its correlation
distance can be used to estimate the resolution (that is, R_cell) of the imaging system.
8.26 a. Show that (8.193) is the solution of min{𝒢(u) + λ[||v − ℋu||² − σ_m²]}.
b. (Maximum entropy spectrum estimation) A special case of log-entropy restora-
tion is the problem of maximizing the entropy subject to

r(n) = (1/2π) ∫_{−π}^{π} S(ω) e^{jωn} dω,   n = 0, ±1, ..., ±p

The maximization is performed with regard to the missing observations, that is,
{r(n), |n| > p}. This problem is equivalent to extrapolating the partial sequence of
autocorrelations {r(n), |n| ≤ p} out to infinity such that the entropy 𝒢 is maximized
and S(ω) = Σ_{n=−∞}^{∞} r(n) e^{−jωn} is the SDF associated with r(n). Show that the max
σ_n² Σ_n (1/λ_n), where λ_0 ≥ λ_1 > λ_2 > ··· > λ_n > λ_{n+1} > ··· and a_n is given by (8.229).
This means the error due to noise increases with the number of terms in the PSWF
expansion.
8.29 a. If the bandlimited function f(x) is sampled at the Nyquist rate (1/Δ) to yield
y(m) = f(mΔ), −M ≤ m ≤ M, what is the MNLS extrapolation of y(m)?
b. Show that the MNLS extrapolation of a bandlimited sequence is bandlimited and
consistent with the given observations.
•
BIBLIOGRAPHY
Section 8.1, 8.2
. For general surveys of image restoration and modeling of imaging systems:
Section 8.4
•
For FIR Wiener Filtering Theory and.more.examples:
8. A. K. Jain and S. Ranganath. "Applications; of Two Dimensional Spectral Estimation in
Image Restoration." Proc. ICASSP-1981 (May 1981): 1113-1116. Also see Prot.
lCASSP-1982 (May 1982): 1520-1523.
• •
For block by block filtering techniques:
•
Section 8.6
Section 8.7
15. T; N. E. Greville (ed.) Theory and Applications of Spline Functions. New York:
Academic Press, 1969. .
16. M. J. Peyrovian. "Image Restoration by Spline Functions." USCIPI Report No. 680,
University of Southern California, Los Angeles, August 1976. Also see Applied Optics
16 (December 1977): 3147-3153.
17. H. S. Hou. "Least Squares Image Restoration Using Spline Interpolation." Ph.D.
•
Dissertation, IPI Report No. 650, University of Southern California, Los Angeles,
March 1976. Also see IEEE Trans. Computers C-26. no. 9 (September 1977): 856-873.
. ,
Section 8.8
•
18. S. Twomey. "On the Numerical Solution of Fredholm Integral Equations of First Kind
by the Inversion of Linear System Produced by Quadrature." J. Assoc. Comput. Mach.
10 (January 1963): 97-101.
19. B. R. Hunt. "The Application of Constrained Least Squares Estimation to Image
Restoration by Digital Computer." IEEE Trans. Computers C-22 (September 1973):
805-812.
20. C. R. Rao and S. K. Mitra. Generalized Inverse of Matrices and its Applications.
New York: John Wiley and Sons, 1971.
21. A. Albert. Regression and Moore-Penrose Pseudoinverse. New York: Academic Press,
1972.
For numerical properties of the gradient algorithms and other iterative methods and
their applications to space-variant image restoration problems:
Addison-Wesley, 1973.
•
23. T. S. Huang, D. A. Barker and S. P. Berger. "Iterative Image Restoration." Applied
Optics 14, no. 5 (MaY 1975): 1165-1168.
24. E. S. Angel and A. K. Jain. "Restoration of Images Degraded by Spatially Varying Point
Spread Functions by a Conjugate Gradient Method." Applied Optics 17 (July 1978):
2186-2190.
•
Section 8.10
For Kalman's original work and its various extensions in recursive filtering theory:
25. R. E. Kalman. "A New Approach to Linear Filtering and Prediction Problems." Trans.
ASME. Ser. D., J. Basic Engineering, 82 (1960): 35-45.
26. B. D. O. Anderson and J. H. Moore. Optimal Filtering. Englewood Cliffs, N.J.:
Prentice-Hall, 1979.
27. G. J. Bierman. "A Comparison of Discrete LinearFiltering Algorithms." IEEE Trans.
Aerosp. Electron. Syst. AES-9 (January 1973): 28-37.
28. M. Morf and T. Kailath. "Square Root Algorithms for Least-Squares Estimation."
IEEE Tram. Aut. Contr. AC·20 (August 1975): 487-497.
For FFr based algorithms for linear estimation and Riccati equations:
29. A. K. Jain and J. J asiulek. "A Class of FIT Based Algorithms for Linear Estimation and
Boundary Value Problems." IEEE Trans. Acous. Speech Sig, Proc. ASSP 31, no. 6
(December 1983): 1435-1446. •
For state variable formulation for image estimation and smoothing and its exten-
sions to restoration of motion degraded images:
Section 8.11
Recursive algorithms for least squares filtering and linear estimation of images have
been considered in:
32. A. K. Jain and E. Angel. "Image Restoration, Modeling and Reduction of Dimen-
sionality." IEEE Trans. Computers C-23 (May 1974): 470-476. Also sec IEEE Trans.
Aut. Contr. AC-18 (February 1973): 59-62.
33. A. Habibi. "Two-Dimensional Bayesian Estimate of Images." Proc. IEEE 60 (July
1972): 878-883. Also see M. Strintzis, "Comments on Two-Dimensional Bayesian Esti-
mate of Images." Proc. IEEE 64 (August 1976): 1255-1257.
34. A. K. Jain and J. R Jain. "Partial Differential Equations and Finite Difference Methods
in Image Processing, Part II: Image Restoration." IEEE Trans. Aut. Control AC-23
(October 1978): 817-834. . .
35. F. C. Schoute, M. F. Terhorst and J. C. Willems. "Hierarchic Recursive Image
Enhancement." IEEE Trans. Circuits and Systems CAS-24 (february 1977): 67-78.
36. J. W. Woods and C. H. Radewan, "Kalman Filtering in Two Dimensions." IEEE Trans.
Inform. Theory IT-23 (July 1977): 473-482.
37. S. A. Rajala and R. J. P. De Figueiredo. "Adaptive Nonlinear Image Restoration
by a Modified Kalman Filtering Approach." IEEE Trans. Acoust. Speech Sig. Proc.
ASSP-29 (October 1981):,1033-1042.
38. S. S. Dikshit, "A Recursive Kalman Window Approach to Image Restoration." IEEE
Trans. Acoust. Speech Sig, Proc. ASSP-30, no. 2 (April 1982): 125-129.
•
'The performance of recursive filters can be improved by adapting the image model
to spatial variations; for example:
39. N. E. Nahi and A. Habibi, "Decision Directed Recursive Image Enhancement." IEEE
•
Trans. Cir. Sys. CAS-22 (March 1975): 286-293.
Section 8.12
•
Semirecursive filtering algorithms for images were introduced in:
40. A. K. Jain. "A Semicausal Model for Recursive Filtering of Two-Dimensional Images."
IEEE Trans. Computers C·26 (April 1977): 345-350.
•
Section 8.13
41. J. C. Dainty (ed.). Laser Speckle. New York: Springer Verlag, 1975.
42. J. W. Goodman. "Statistical Properties of Laser Speckle Patterns," In Laser Speckle
(41). ,
•
43. Speckle in Optics. Special Issue, J; Opt. Soc. Am. 66 (November 1976).
Section 8.14
For maximum entropy restoration algorithms applicable to images, see [5] and:
48. B. R. Frieden. "Restoring with Maximum Likelihood and Maximum Entropy," J. Opt.
Soc. Amer. 62 (1972): 511-518. . .
49. A. Lent. "A Convergent Algorithm for Maximum Entropy Image Restoration with a
Medical X-ray Application." In Image Analysis and Evaluation, SPSE Conf. Proc,
(R. Shaw, ed.), Toronto, Canada, July 1976, pp. 221-267. .
Section 8.15
•
For application of Bayesian methods for realizing MAP and ML estimators for
nonlinear image restoration problems, see [4] and:
•
52. B. R. Hunt. "Bayesian Methods in Nonlinear Digital Image Restoration." IEEE Trans.
Computers C-26,.n(\. 3, pp. 219-229.
53. H. J. Trussell. "A Relationship between Image Restoration by the Maximum A Poste-
riori Method and a Maximum Entropy Method,' IEEE Trans. Acous. Speech Sig, Proc.
ASSP-28, no. 1 (February 1980): 114-117. Also see vol. ASSP-31, no. 1 (February 1983):
129-136.
54. J. B. Morton and H. C. Andrews. "A Posteriori Method of Image Restoration." J. Opt.
Soc. Amer. 69, no. 2 (February 1979): 280-290.
•
56. G. M. Robbins. "Image Restoration for a Class of Linear Spatially Variant Degrada-
tions." Pattern Recognition 2 (1970): 91-103. Also see Proc. IEEE 60 (July 1972):
862-872.
57. A. A. Sawchuk. "Space-Variant Image Restoration by Coordinate Transformations."
J. Opt. Soc. Am. 64 (February 1974): 138-144. Also see J. Opt. Soc. Am. 63 (1973):
1052-1062.
I
Section 8.17
Section 8.18
Here we follow:
60. D. Slepian and H. O. Pollak. "Prolate Spheroidal Wave Functions, Fourier Analysis and
Uncertainty-I." BSTJ 40 (Janaury 1961): 43-62.
•
,
•
•
Other References
•
66. E. S. Angel and A. K. Jain. "Frame to Frame Restoration of Diffusion Images." IEEE
Trans. Auto. Control AC-23 (October 1978): 850-855.
67. B. L. McGlamery. "Restoration of Turbulence Degraded Images." J. Opt. Soc. Am.
57 (March 1967): 293-297.
68. J. S. Lim and N. A. Malik. "A New Algorithm for Two-Dimensional Maximum Entropy
Power Spectrum Estimation." IEEE Trans. Acoust. Speech, Signal Process. ASSP-29
(1981): 401-413.
9 Image Analysis and Computer Vision
9.1 INTRODUCTION
The ultimate aim in a large number of image processing applications (Table 9.1) is
to extract important features from image data, from which a description, interpreta-
tion, or understanding of the scene can be provided by the machine (Fig. 9.1). For
example, a vision system may distinguish parts on an assembly line and list their
features, such as size and number of holes. More sophisticated vision systems are
•
(Figure 9.1 A computer vision system: the input image is preprocessed, features are extracted, the image is segmented, and the result is passed either to classification and description or to a symbolic representation followed by interpretation and description. The feature extraction through description stages form the image analysis system; the complete chain forms an image understanding system.)
able to interpret the results of analyses and describe the various objects and their
relationships in the scene. In this sense image analysis is quite different from other
image processing operations, such as restoration, enhancement, and coding, where
•
the output is another image. Image analysis basically involves the study of feature
extraction, segmentation, and classification techniques (Fig. 9.2).
In computer vision systems such as the one shown in Fig. 9.1, the input image
is first preprocessed, which may involve restoration, enhancement, or just proper
representation of the data. Then certain features are extracted for segmentation of
the image into its components, for example, separation of different objects by
extracting their boundaries. The segmented image is fed into a classifier or an image
understanding system. Image classification maps different regions or segments into
one of several objects, each identified by a label. For example, in sorting nuts and
bolts, all objects identified as square shapes with a hole may be classified as nuts and
those with elongated shapes, as bolts. Image understanding systems determine the
relationships between different objects in a scene in order to provide its description.
For example, an image understanding system should be able to send the report:

The field of view contains a dirt road surrounded by grass.

Such a system should be able to classify different textures such as sand, grass, or
corn using prior knowledge and then be able to use predefined rules to generate a
description.
(Figure 9.2 Image analysis techniques.)

Spatial features of an object may be characterized by its gray levels, their joint
probability distributions, spatial distribution, and the like.
•
Amplitude Features

Histogram Features

Histogram features are based on the histogram of a region of the image. Let u be a
random variable representing a gray level in a given region of the image. Define

p_u(x) ≜ Prob[u = x] ≜ (number of pixels with gray level x) / (total number of pixels in the region),
x = 0, ..., L − 1        (9.1)

Common features of p_u(x) are its moments, entropy, and so on, which are defined
next.

Moments:  m_i = E[u^i] = Σ_{x=0}^{L−1} x^i p_u(x),   i = 1, 2, ...        (9.2)

Absolute moments:  m̄_i = E[|u|^i] = Σ_{x=0}^{L−1} |x|^i p_u(x)        (9.3)

Central moments:  μ_i = E[(u − E(u))^i] = Σ_{x=0}^{L−1} (x − m_1)^i p_u(x)        (9.4)

Entropy:  −Σ_{x=0}^{L−1} p_u(x) log2 p_u(x) bits        (9.6)
Some of the common histogram features are the dispersion, mean = m_1,
variance = μ_2, mean square value or average energy = m_2, skewness = μ_3, kurtosis
= μ_4 − 3. Other useful features are the median and the mode. A narrow histogram
indicates a low contrast region. Variance can be used to measure local activity in the
amplitudes. Histogram features are also useful for shape analysis of objects from
their projections (see Section 9.7).
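These first-order features are computed directly from the normalized histogram. A short numpy sketch of this (my illustration; L is the number of gray levels and the region is assumed to be an array of nonnegative integers):

import numpy as np

def histogram_features(region, L=256):
    """First-order histogram features, following (9.1)-(9.6)."""
    p = np.bincount(region.ravel(), minlength=L).astype(float)
    p /= p.sum()                               # p_u(x), x = 0, ..., L-1
    x = np.arange(L)
    m = {i: np.sum(x ** i * p) for i in (1, 2)}               # moments m_i
    mu = {i: np.sum((x - m[1]) ** i * p) for i in (2, 3, 4)}  # central moments
    nz = p > 0
    entropy = -np.sum(p[nz] * np.log2(p[nz]))  # in bits
    return m, mu, entropy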
•
Moments can also be measured locally over a moving window W, for instance,

μ_i(k, l) = (1/N_W) Σ_{(m,n)∈W} [u(m − k, n − l) − m_1(k, l)]^i        (9.8)

where i = 1, 2, ... and N_W is the number of pixels in the window W. Figure 9.3 shows
the spatial distribution of different histogram features measured over a 3 × 3
moving window. The standard deviation emphasizes the strong edges in the image
and the dispersion feature extracts the fine edge structure. The mean, median, and
mode extract low spatial-frequency features.
Second-order joint probabilities have also been found useful in applications
such as feature extraction of textures (see Section 9.11). A second-order joint
probability is defined as

p_u(x1, x2) ≜ p_{u1,u2}(x1, x2) ≜ Prob[u1 = x1, u2 = x2],   x1, x2 = 0, ..., L − 1
            = (number of pairs of pixels with u1 = x1, u2 = x2) / (total number of such pairs of pixels in the region)        (9.9)
•
we have the first-order histogram values p_u(x), x = 0, 1, 2, 3. The second-order histogram for
u1 = u(m, n), u2 = u(m + 1, n + 1) is the 4 × 4 concurrence matrix of pair counts

      x2 →
x1 ↓  1  1  1  0
      0  2  1  0
      1  1  0  0
      0  1  0  0
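The second-order histogram for a given displacement is just a normalized count of gray-level pairs. A sketch of the computation (my illustration; the displacement (1, 1) matches the example above, and u is assumed to be an integer array with levels 0, ..., L − 1):

import numpy as np

def cooccurrence(u, L, dm=1, dn=1):
    """Second-order histogram (9.9) for the pair u1 = u(m, n), u2 = u(m+dm, n+dn)."""
    P = np.zeros((L, L))
    u1 = u[:u.shape[0] - dm, :u.shape[1] - dn]
    u2 = u[dm:, dn:]
    for a, b in zip(u1.ravel(), u2.ravel()):
        P[a, b] += 1
    return P / P.sum()        # normalize by the total number of pairs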
•
Image transforms provide the frequency domain information in the data. Transform
features are extracted by zonal filtering the image in the selected transform space
(Fig. 9.4). The zonal filter, also called the feature mask, is simply a slit or an aper-
ture. Figure 9.5 shows different masks and Fourier transform features of different
shapes. Generally, the high-frequency features can be used for edge and boundary
detection, and angular slits can be used for detection of orientation. For example,
an image containing several parallel lines with orientation θ will exhibit strong
energy along a line at angle π/2 + θ passing through the origin of its two-
dimensional Fourier transform. This follows from the properties of the Fourier
transform (also see the projection theorem, Section 10.4). A combination of an
angular slit with a bandlimited low-pass, band-pass, or high-pass filter can be used
for discriminating periodic or quasiperiodic textures. Other transforms, such as
Haar and Hadamard, are also potentially useful for feature extraction. However,
systematic studies remain to be done to determine their applications. Chapters 5
and 7 contain examples of image transforms and their processed outputs.
Transform-feature extraction techniques are also important when the source
data originates in the transform coordinates. For example, in optical and optical-
digital (hybrid) image analysis applications, the data can be acquired directly in the
Fourier domain for real-time feature extraction in the focal plane.
(Figure 9.4 Transform feature extraction: input image u(m, n) → forward transform → zonal mask g(k, l) → inverse transform → feature image.)

(Figure 9.5 Feature masks and Fourier transform features of different shapes, including (c) triangular and vertical shapes and (d) 45° orientation and the letter J.)
(Figure 9.6 Gradient of f(x, y) along the direction r.)

where ⊕ denotes the logical exclusive-OR operation. For a continuous image
f(x, y), its derivative assumes a local maximum in the direction of the edge. There-
fore, one edge detection technique is to measure the gradient of f along r in a
direction θ (Fig. 9.6), that is,

∂f/∂r = (∂f/∂x)(∂x/∂r) + (∂f/∂y)(∂y/∂r) = f_x cos θ + f_y sin θ        (9.11)

The maximum value of ∂f/∂r is obtained when (∂/∂θ)(∂f/∂r) = 0. This gives
Gradient Operators

These are represented by a pair of masks H1, H2, which measure the gradient of the
image u(m, n) in two orthogonal directions (Fig. 9.7). Defining the bidirectional
gradients g1(m, n) ≜ ⟨U, H1⟩_{m,n}, g2(m, n) ≜ ⟨U, H2⟩_{m,n}, the gradient vector magnitude
and direction are given by

g(m, n) = √(g1²(m, n) + g2²(m, n))        (9.14)

Isotropic:
H1 = [ −1   0   1 ]        H2 = [ −1  −√2  −1 ]
     [ −√2  0  √2 ]             [  0    0   0 ]
     [ −1   0   1 ]             [  1   √2   1 ]
Figure 9.8 Edge detection examples. In each case, gradient images (left), edge
maps (right).
that 5 to 10% of pixels with largest gradients are declared as edges. Figure 9.8a
shows the gradients and edge maps using the Sobel operator on two different
images.
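In code, the procedure is to convolve with the two masks, take the magnitude of (9.14), and threshold at a percentile so that roughly 5 to 10% of the pixels are kept as edges. A short sketch of this (my illustration, using the Sobel pair; scipy's convolve and the keep fraction are assumptions):

import numpy as np
from scipy.ndimage import convolve

def sobel_edges(u, keep_fraction=0.07):
    """Gradient magnitude (9.14) with the Sobel masks, thresholded to an edge map."""
    H1 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    H2 = H1.T
    g1 = convolve(u.astype(float), H1)
    g2 = convolve(u.astype(float), H2)
    g = np.hypot(g1, g2)                              # gradient magnitude
    t = np.quantile(g, 1.0 - keep_fraction)           # keep the largest gradients
    return g, g >= t                                  # gradient image and edge map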
Compass Operators
•
Compass operators measure gradients in a selected number of directions (Fig. 9.9).
Table 9.3 shows four different compass gradients for north-going edges. An anti-
clockwise circular shift of the eight boundary elements of these masks gives a 45°
rotation of the gradient direction. For example, the eight compass gradients corre-
sponding to the third operator of Table 9.3 are
(N)  [ 1  1  1 ]   (NW) [ 1  1  0 ]   (W)  [ 1  0 −1 ]   (SW) [ 0 −1 −1 ]
     [ 0  0  0 ]        [ 1  0 −1 ]        [ 1  0 −1 ]        [ 1  0 −1 ]
     [−1 −1 −1 ]        [ 0 −1 −1 ]        [ 1  0 −1 ]        [ 1  1  0 ]

(S)  [−1 −1 −1 ]   (SE) [−1 −1  0 ]   (E)  [−1  0  1 ]   (NE) [ 0  1  1 ]
     [ 0  0  0 ]        [−1  0  1 ]        [−1  0  1 ]        [−1  0  1 ]
     [ 1  1  1 ]        [ 0  1  1 ]        [−1  0  1 ]        [−1 −1  0 ]
Let g_k(m, n) denote the compass gradient in the direction θ_k = π/2 + kπ/4,
k = 0, ..., 7. The gradient at location (m, n) is defined as

g(m, n) ≜ max_k {|g_k(m, n)|}        (9.19)

which can be thresholded to obtain the edge map as before. Figure 9.8b shows the
results for the Kirsch operator. Note that only four of the preceding eight compass
gradients are linearly independent. Therefore, it is possible to define four 3 × 3
arrays that are mutually orthogonal and span the space of these compass gradients.
These arrays are called orthogonal gradients and can be used in place of the compass
gradients [12]. Compass gradients with higher angular resolution can be designed by
increasing the size of the mask.

(Table 9.3, remaining north-going operators: 2) Kirsch: [5 5 5; −3 0 −3; −3 −3 −3];  4) [1 2 1; 0 0 0; −1 −2 −1].)
•
The foregoing methods of estimating the gradients work best when the gray-level
transition is quite abrupt, like a step function. As the transition region gets wider
(Fig. 9.10), it is more advantageous to apply the second-order derivatives. One
frequently encountered operator is the Laplacian operator, defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²        (9.20)

(Figure 9.10 A wide gray-level transition f(x), its first derivative df/dx, and the zero-crossing of the second derivative.)
Table 9.4 gives three different discrete approximations of this operator. Figure 9.8d shows the edge extraction ability of the Laplace mask (2). Because of the second-order derivatives, this gradient operator is more sensitive to noise than those previously defined. Also, the thresholded magnitude of ∇²f produces double edges. For these reasons, together with its inability to detect the edge direction, the Laplacian as such is not a good edge detection operator. A better utilization of the Laplacian is to use its zero-crossings to detect the edge locations (Fig. 9.10). A generalized Laplacian operator, which approximates the Laplacian of Gaussian functions, is a powerful zero-crossing detector [13]. It is defined as

h(m, n) ≜ c [1 − (m² + n²)/σ²] exp[ −(m² + n²)/(2σ²) ]    (9.21)

where σ controls the width of the Gaussian kernel and c normalizes the sum of the elements of a given size mask to unity. Zero-crossings of a given image convolved with h(m, n) give its edge locations. On a two-dimensional grid, a zero-crossing is said to occur wherever there is a zero-crossing in at least one direction.
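A minimal Python/NumPy sketch of this idea follows; the mask size and σ are free parameters of the illustration, and the normalizing constant c of (9.21) is omitted here since a scale factor does not move the zero-crossings.

```python
import numpy as np

def generalized_laplacian(size=9, sigma=1.4):
    """Generalized Laplacian mask, cf. (9.21), up to the scale constant c."""
    r = size // 2
    m, n = np.mgrid[-r:r + 1, -r:r + 1]
    return (1.0 - (m**2 + n**2) / sigma**2) * np.exp(-(m**2 + n**2) / (2 * sigma**2))

def zero_crossing_edges(img, size=9, sigma=1.4):
    """Edge map from zero-crossings of the filtered image."""
    h = generalized_laplacian(size, sigma)
    r = size // 2
    pad = np.pad(img.astype(float), r, mode='edge')
    out = np.zeros(img.shape, float)
    for i in range(size):
        for j in range(size):
            out += h[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    edges = np.zeros(img.shape, bool)
    # a zero-crossing in at least one of the horizontal/vertical directions
    edges[:, :-1] |= np.signbit(out[:, :-1]) != np.signbit(out[:, 1:])
    edges[:-1, :] |= np.signbit(out[:-1, :]) != np.signbit(out[1:, :])
    return edges
```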
Figure 9.11 Edge model with transition region one pixel wide.
TABLE 9.5  Stochastic gradients H₁ for edge extraction of noisy images with r(k, l) = 0.99^√(k²+l²), H₂ = H₁ᵀ; SNR = σ²/σₙ².

5 × 5 masks:
              SNR = 1                                    SNR = 9
  0.802  0.836  0  −0.836  −0.802          0.267  0.364  0  −0.364  −0.267
  0.845  0.897  0  −0.897  −0.845          0.373  0.562  0  −0.562  −0.373
  0.870  1.000  0  −1.000  −0.870          0.463  1.000  0  −1.000  −0.463
  0.845  0.897  0  −0.897  −0.845          0.373  0.562  0  −0.562  −0.373
  0.802  0.836  0  −0.836  −0.802          0.267  0.364  0  −0.364  −0.267
To detect the presence of an edge at location P, calculate the horizontal gradient, for instance, as

g₁(m, n) ≜ û_f(m, n − 1) − û_b(m, n + 1)    (9.23)

Here û_f(m, n) and û_b(m, n) are the optimum forward and backward estimates of u(m, n) based on the noisy observations given over some finite regions W of the left and right half-planes, respectively. Thus û_f(m, n) and û_b(m, n) are semicausal estimates (see Chapter 6). For observations v(m, n) containing additive white noise, we can find the best linear mean square semicausal FIR estimate of the form

The filter weights a(k, l) can be determined following Section 8.4 with the modification that W is a semicausal window [see (8.65) and (8.69)]. The backward semicausal estimate employs the same filter weights, but applied backward. Using the definitions in (9.23), the stochastic gradient operator H₁ is obtained as shown in Table 9.5. The operator H₂ is the 90° counterclockwise rotation of H₁, which, due to its symmetry properties, is simply H₁ᵀ. These masks have been normalized so that the coefficient a(0, 0) in (9.24) is unity. Note that for high SNR the filter weights decay rapidly. Figure 9.8c shows the gradients and edge maps obtained by applying the 5 × 5 stochastic masks designed for SNR = 9 to noiseless images. Figure 9.12 compares the edges detected from noisy images by the Sobel and the stochastic gradient masks.
Edge detection operators can be compared in a number of different ways. First, the image gradients may be compared visually, since the eye itself performs some sort of edge detection. Figure 9.13 displays different gradients for noiseless as well as noisy images. In the noiseless case all the operators are roughly equivalent. The stochastic gradient is found to be quite effective when noise is present. Quantitatively, the performance in noise of an edge detection operator may be measured as follows. Let n₀ be the number of edge pixels declared and n_t be the number of missed or new edge pixels after adding noise. If n₀ is held fixed for the noiseless as well as the noisy images, then the edge detection error rate is

p_e = n_t / n₀    (9.25)

In Figure 9.12 the error rate for the Sobel operator used on noisy images with SNR ≈ 10 dB is 24%, whereas it is only 2% for the stochastic operator.
Another figure of merit for the noise performance of edge detection operators is the quantity

P = [1 / max(N_I, N_D)] Σ_{i=1}^{N_D} 1 / (1 + α d_i²)    (9.26)
Figure 9.12 Edge detection from noisy images. Upper two: Sobel. Lower two: stochastic.
where d_i is the distance between a pixel declared as an edge and the nearest ideal edge pixel, α is a calibration constant, and N_I and N_D are the numbers of ideal and detected edge pixels, respectively. Among the gradient and compass operators of Tables 9.2 and 9.3 (not including the stochastic masks), the Sobel and Prewitt operators have been found to yield the highest performance (where performance is proportional to the value of P) [17].
Lines are extended edges. Table 9.6 shows compass gradients for line detection. Other forms of line detection require fitting a line (or a curve) through a set of edge points. Some of these ideas are explored in Section 9.5.
Figure 9.13 (a) Gradients for the noiseless image; (b) gradients for the noisy image.
Spots are isolated edges. These are most easily detected by comparing the value of a pixel with an average or median of the neighborhood pixels.
Boundaries are linked edges that characterize the shape of an object. They are useful in the computation of geometric features such as size or orientation.
Connectivity
Conceptually, boundaries can be found by tracing the connected edges. On a rectangular grid a pixel is said to be four- or eight-connected when it has the same properties as one of its nearest four or eight neighbors, respectively (Fig. 9.14). There are difficulties associated with these definitions of connectivity, as shown in Fig. 9.14c. Under four-connectivity, segments 1, 2, 3, and 4 would be classified as disjoint, although they are perceived to form a connected ring. Under eight-connectivity these segments are connected, but the inside hole (for example, pixel B) is also eight-connected to the outside (for instance, pixel C). Such problems can
Figure 9.14 Connectivity on a rectangular grid. Pixel A and its (a) 4-connected and (b) 8-connected neighbors; (c) connectivity paradox: "Are B and C connected?"
Contour Following
As the name suggests, contour-following algorithms trace boundaries by ordering successive edge points. A simple algorithm for tracing closed boundaries in binary images is shown in Fig. 9.15. This algorithm can yield a coarse contour, with some of the boundary pixels appearing twice. Refinements based on eight-connectivity tests for edge pixels can improve the contour trace [2]. Given this trace, a smooth curve, such as a spline, through the nodes can be used to represent the contour. Note that this algorithm will always trace a boundary, open or closed, as a closed contour. The method can be extended to gray-level images by searching for edges in the 45° to 135° directions from the direction of the gradient, so as to move from the inside to the outside of the boundary, and vice versa [19]. A modified version of this contour-following method is called the crack-following algorithm [25]. In that algorithm each pixel is viewed as having a square-shaped boundary, and the object boundary is traced by following the edge-pixel boundaries.
Algorithm (Fig. 9.15):
1. Start inside region A (e.g., at pixel 1).
2. Turn left and step to the next pixel if it is in region A (e.g., 1 to 2); otherwise turn right and step (e.g., 2 to 3).
3. Continue until you arrive at starting point 1.
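A minimal sketch of this left/right-turn tracer (Python/NumPy; the starting pixel and the convention that the object is marked 1 are assumptions of the illustration, not prescriptions from the text):

```python
import numpy as np

def trace_contour(img, start):
    """Left/right-turn (turtle) tracing of a closed boundary in a binary image."""
    steps = [(-1, 0), (0, 1), (1, 0), (0, -1)]     # up, right, down, left
    r, c = start
    d = 1                                          # initial heading: right
    contour = []
    for _ in range(4 * img.size):                  # safety bound on the walk length
        inside = 0 <= r < img.shape[0] and 0 <= c < img.shape[1] and img[r, c] == 1
        if inside:
            contour.append((r, c))
            d = (d - 1) % 4                        # turn left when inside the region
        else:
            d = (d + 1) % 4                        # turn right when outside
        r, c = r + steps[d][0], c + steps[d][1]
        if (r, c) == start:
            break
    return contour

# Example: boundary of a small filled square
img = np.zeros((8, 8), int); img[2:6, 2:6] = 1
boundary = trace_contour(img, (2, 2))
```

As the text notes, the trace is coarse and some pixels may appear twice; it is a starting point for the eight-connectivity refinements mentioned above.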
constitutes the boundary path. The speed of the algorithm depends on the chosen φ [20, 21]. Note that such an algorithm need not give the globally optimum path.

Example 9.2 Heuristic search algorithms [19]
Consider a 3 × 5 array of edges whose gradient magnitudes |g| and tangential contour directions θ are shown in Fig. 9.16a. The contour directions are at 90° to the gradient directions. A pixel X is considered to be linked to Y if the latter is one of the three eight-connected neighbors (Y₁, Y₂, or Y₃ in Fig. 9.16b) in front of the contour direction and if |θ(X) − θ(Y)| < 90°. This yields the graph of Fig. 9.16c.
As an example, suppose φ(x_k) is the sum of edge gradient magnitudes along the path from A to x_k. At A, the successor nodes are D, C, and G, with φ(D) = 12, φ(C) = 6, and φ(G) = 8. Therefore, node D is selected, and C and G are discarded. From here on nodes E, F, and B provide the remaining path. Therefore, the boundary path is ADEFB. On the other hand, note that path ACDEFB is the path of maximum cumulative gradient.
Dynamic Programming
Dynamic programming is a method of finding the global optimum of multistage processes. It is based on Bellman's principle of optimality [22], which states that the optimum path between two given points is also optimum between any two points lying
Figure 9.16 (a) Gradient magnitudes and contour directions; (b) linkage rules; (c) graph interpretation.
on the path. Thus if C is a point on the optimum path between A and B (Fig. 9.17), then the segment CB is the optimum path from C to B, no matter how one arrives at C.
To apply this idea to boundary extraction [23], suppose the edge map has been converted into a forward-connected graph of N stages and we have an evaluation function

S(x₁, x₂, ..., x_N, N) ≜ Σ_{k=1}^{N} |g(x_k)| − α Σ_{k=2}^{N} |θ(x_k) − θ(x_{k−1})| − β Σ_{k=2}^{N} d(x_k, x_{k−1})    (9.27)

Here x_k, k = 1, ..., N, represents the nodes (that is, the vector of edge pixel locations) in the kth stage of the graph, d(x, y) is the distance between two nodes x and y; |g(x_k)| and θ(x_k) are the gradient magnitude and angle, respectively, at the node x_k, and α and β are nonnegative parameters. The optimum boundary is given by connecting the nodes x_k, k = 1, ..., N, so that S(x₁, x₂, ..., x_N, N) is maximum. Define

φ(x_N, N) ≜ max_{x₁, ..., x_{N−1}} {S(x₁, ..., x_N, N)}    (9.28)

Using the definition of (9.27), we can write the recursion

S(x₁, ..., x_N, N) = S(x₁, ..., x_{N−1}, N − 1) + f(x_{N−1}, x_N)
Figure 9.17 Bellman's principle of optimality. If the path AB is optimum, then so is CB, no matter how you arrive at C.
φ(x₁, 1) ≜ |g(x₁)|
This procedure is remarkable in that the global optimization of S(x₁, ..., x_N, N) has been reduced to N stages of two-variable optimizations. In each stage, for each value of x_k, one has to search for the optimum φ(x_k, k). Therefore, if each x_k takes L different values, the total number of search operations is (N − 1)(L² − 1) + (L − 1). This is significantly smaller than the L^N − 1 exhaustive searches required for direct maximization of S(x₁, x₂, ..., x_N, N) when L and N are large.
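The recursion lends itself to a short dynamic-programming sketch. The Python/NumPy illustration below maximizes a cumulative node gain stage by stage and backtracks the optimal path; using only the gradient term of (9.27), and the toy gains and link structure, are simplifying assumptions of this example.

```python
import numpy as np

def dp_boundary(node_gain, allowed):
    """node_gain[k][i]: |g| at node i of stage k; allowed[k][i, j]: True if node i of
    stage k may be followed by node j of stage k+1. Returns optimal node indices."""
    N = len(node_gain)
    phi = [np.asarray(node_gain[0], float)]          # phi(x_1, 1) = |g(x_1)|
    back = []
    for k in range(1, N):
        gain_k = np.asarray(node_gain[k], float)
        # candidate[i, j] = phi(previous node i) + gain of node j, -inf if not linked
        cand = np.where(allowed[k - 1], phi[-1][:, None] + gain_k[None, :], -np.inf)
        back.append(cand.argmax(axis=0))             # best predecessor of each node
        phi.append(cand.max(axis=0))
    path = [int(phi[-1].argmax())]
    for k in range(N - 2, -1, -1):                   # backtrack the optimum path
        path.append(int(back[k][path[-1]]))
    return path[::-1]

# Toy example: 3 stages with 2, 3, 2 nodes, fully linked
gains = [[5, 1], [2, 7, 3], [4, 6]]
links = [np.ones((2, 3), bool), np.ones((3, 2), bool)]
print(dp_boundary(gains, links))                     # [0, 1, 1]
```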
Example 9.3
Consider the gradient image of Fig. 9.16. Applying the linkage rule of Example 9.2 and letting α = 4/π, β = 0, we obtain the graph of Fig. 9.18a, which shows the values of the various segments connecting different nodes. Specifically, we have N = 5 and φ(A, 1) = 5. For k = 2, we get
Figure 9.18 (b) φ(x_k, k) at the various stages; the solid line gives the optimal path.
(a) Straight line; (b) Hough transform, (s, θ) parameter space.
C(s_k, θ_l) = C(s_k, θ_l) + 1,    if x_i cos θ + y_i sin θ = s_k for θ = θ_l    (9.32)

Then the local maxima of C(s, θ) give the different straight-line segments through the edge points. This two-dimensional search can be reduced to a one-dimensional search if the gradient directions θ_i at each edge location are also known. Differentiating both sides of (9.31) with respect to x, we obtain

dy/dx = −cot θ = tan(π/2 + θ)    (9.33)

Hence C(s, θ) need be evaluated only for θ = −π/2 − θ_i. The Hough transform can also be generalized to detect curves other than straight lines. This, however, increases the dimension of the parameter space that must be searched [3]. From Chapter 10, it can be concluded that the Hough transform can also be expressed as the Radon transform of a line delta function.
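A minimal accumulator sketch of the straight-line Hough transform of (9.31)-(9.32) (Python/NumPy; the discretization steps for s and θ are assumed parameters of this illustration):

```python
import numpy as np

def hough_lines(edge_points, s_max, n_s=128, n_theta=180):
    """Accumulate C(s, theta) over edge points (x_i, y_i), cf. (9.32)."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    s_bins = np.linspace(-s_max, s_max, n_s)
    C = np.zeros((n_s, n_theta), int)
    for x, y in edge_points:
        s = x * np.cos(thetas) + y * np.sin(thetas)        # s for every theta
        k = np.clip(np.searchsorted(s_bins, s), 0, n_s - 1)
        C[k, np.arange(n_theta)] += 1                      # vote
    return C, s_bins, thetas

# Example: points on the vertical line x = 5 (theta = 0, s = 5)
pts = [(5, y) for y in range(20)]
C, s_bins, thetas = hough_lines(pts, s_max=30)
k, t = np.unravel_index(C.argmax(), C.shape)
print(s_bins[k], thetas[t])                                # approximately 5.0, 0.0
```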
In chain coding the direction vectors between successive boundary pixels are en-
coded. For example, a commonly used chain code (Fig. 9.20) employs eight direc-
tions, which can be coded by 3-bit code words. Typically, the chain code contains
the start pixel address followed by a string of code words. Such codes can be
generalized by increasing the number of allowed direction vectors between
successive boundary pixels. A limiting case is to encode the curvature of the contour
as a function of contour length t (Fig. 9.21).
Figure 9.20 Chain code directions (0 to 7) and coding algorithm:
1. Start at any boundary pixel, A.
2. Find the nearest edge pixel and code its orientation. In case of a tie, choose the one with the largest (or smallest) code value.
3. Continue until there are no more boundary pixels.

Figure 9.21 Curvature of the contour as a function of contour length t.
Algorithm. Approximate the curve by the line segment joining its end points (A, B). If the distance from the farthest curve point (C) to the segment is greater than a predetermined quantity, join AC and BC. Repeat the procedure for the new segments AC and BC, and continue until the desired accuracy is reached.
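A compact recursive sketch of this split procedure (Python/NumPy; the distance tolerance tol is an assumed parameter, and points are taken as (x, y) pairs):

```python
import numpy as np

def approximate(points, tol=2.0):
    """Recursively split a curve at the point farthest from the chord A-B."""
    pts = np.asarray(points, float)
    a, b = pts[0], pts[-1]
    chord = b - a
    norm = np.hypot(chord[0], chord[1])
    if norm == 0.0 or len(pts) <= 2:
        return [tuple(a), tuple(b)]
    d = pts - a
    dist = np.abs(chord[0] * d[:, 1] - chord[1] * d[:, 0]) / norm   # point-to-chord distance
    i = int(dist.argmax())
    if dist[i] <= tol:                      # segment is accurate enough
        return [tuple(a), tuple(b)]
    left = approximate(pts[:i + 1], tol)    # refine A..C
    right = approximate(pts[i:], tol)       # refine C..B
    return left[:-1] + right                # drop the duplicated split point

corners = approximate([(0, 0), (1, 3), (2, 4), (5, 5), (9, 0)], tol=1.0)
```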
B-splines are piecewise polynomial functions that can provide local approximations of contours of shapes using a small number of parameters. This is useful because human perception of shapes is deemed to be based on the curvatures of parts of contours (or object surfaces) [30]. This results in compression of boundary data as well as smoothing of coarsely digitized contours. B-splines have been used in shape synthesis and analysis, computer graphics, and recognition of parts from boundaries.

Let t be a boundary curve parameter and let x(t) and y(t) denote the given boundary addresses. The B-spline representation is written as

x(t) = Σ_{i=0}^{n} p_i B_{i,k}(t),    x(t) ≜ (x(t), y(t))ᵀ    (9.34)

where the p_i are called the control points and the B_{i,k}(t), i = 0, 1, ..., n, k = 1, 2, ..., are called the normalized B-splines of order k. In computer graphics these functions are also called basis splines or blending functions and can be generated via the recursion

B_{i,k}(t) = [(t − t_i)/(t_{i+k−1} − t_i)] B_{i,k−1}(t) + [(t_{i+k} − t)/(t_{i+k} − t_{i+1})] B_{i+1,k−1}(t),    k = 2, 3, ...    (9.35a)
The variable t is also called the node parameter, of which t_i and s_i are special values. Figure 9.23 shows some of the B-spline functions. These functions are nonnegative and have finite support. In fact, for the normalized B-splines, 0 ≤ B_{i,k}(t) ≤ 1, and the region of support of B_{i,k}(t) is [t_i, t_{i+k}). The functions B_{i,k}(t) form a basis in the space of piecewise polynomial functions. These functions are called open B-splines or closed (or periodic) B-splines, depending on whether the boundary being represented is open or closed. The parameter k controls the order of continuity of the curve. For example, for k = 3 the splines are piecewise quadratic polynomials. For k = 4, these are cubic polynomials. In computer graphics k = 3 or 4 is generally found to be sufficient.
When the knots are uniformly spaced, that is,

t_{i+1} − t_i = Δt,    ∀i    (9.37a)

the B_{i,k}(t) are called uniform splines and they become translates of B_{0,k}(t), that is,

B_{i,k}(t) = B_{0,k}(t − i),    i = k − 1, k, ..., n − k + 1    (9.37b)

Near the boundaries, B_{i,k}(t) is obtained from (9.35). For uniform open B-splines with Δt = 1, the knot values can be chosen as

t_i = { 0,           i < k
        i − k + 1,   k ≤ i ≤ n    (9.38)
        n − k + 2,   i > n

and for uniform periodic (or closed) B-splines, the knots can be chosen as

t_i = i mod (n + 1)    (9.39)
(Figure 9.23: constant, linear, and higher-order B-spline functions.)
For k = 1, 2, 3, 4 and knots given by (9.39), the analytic forms of B_{0,k}(t) are provided in Table 9.7.
Control points. The control points p_i are not only the series coefficients in (9.34); they physically define the vertices of a polygon that guides the splines to trace a smooth curve (Fig. 9.24). Once the control points are given, it is straightforward to obtain the curve trace x(t) via (9.34). The number of control points necessary to reproduce a given boundary accurately is usually much smaller than the number of points needed to trace a smooth curve. Data compression by factors of 10 to 1000 can be achieved, depending on the resolution and complexity of the shape.
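As an illustration of (9.34)-(9.35), the sketch below (Python/NumPy) evaluates an open uniform B-spline curve from its control points via the recursion; the sample control points are invented for the example, and the clamped knot vector follows (9.38).

```python
import numpy as np

def bspline_basis(i, k, t, knots):
    """Normalized B-spline B_{i,k}(t) by the recursion (9.35)."""
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    value = 0.0
    d1 = knots[i + k - 1] - knots[i]
    if d1 > 0:
        value += (t - knots[i]) / d1 * bspline_basis(i, k - 1, t, knots)
    d2 = knots[i + k] - knots[i + 1]
    if d2 > 0:
        value += (knots[i + k] - t) / d2 * bspline_basis(i + 1, k - 1, t, knots)
    return value

def bspline_curve(ctrl, k, num=200):
    """Trace x(t) = sum_i p_i B_{i,k}(t) for an open uniform spline, cf. (9.34), (9.38)."""
    ctrl = np.asarray(ctrl, float)
    n = len(ctrl) - 1
    knots = [0] * k + list(range(1, n - k + 2)) + [n - k + 2] * k   # open uniform knots
    ts = np.linspace(knots[0], knots[-1] - 1e-9, num)
    pts = [sum(ctrl[i] * bspline_basis(i, k, t, knots) for i in range(n + 1)) for t in ts]
    return np.array(pts)

# Example: quadratic (k = 3) curve guided by five control points
curve = bspline_curve([(0, 0), (2, 4), (5, 5), (8, 3), (9, 0)], k=3)
```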
A B-spline-generated boundary can be translated, scaled (zooming or shrinking), or rotated by performing the corresponding transformations on the control points.

The cubic (k = 4) entry of Table 9.7, for instance, is

B_{0,4}(t) = { t³/6,                         0 ≤ t < 1
               (−3t³ + 12t² − 12t + 4)/6,    1 ≤ t < 2
               (3t³ − 24t² + 60t − 44)/6,    2 ≤ t < 3
               (4 − t)³/6,                   3 ≤ t < 4
               0,                            otherwise
Figure 9.24 (a), (b) B-spline curves fitted through 128 and 11M points of the original boundaries containing 1038 and 10536 points, respectively, to yield indistinguishable reproductions; (c), (d) the corresponding 16 and 99 control points, respectively. Since (a), (b) can be reproduced from (c), (d), compression ratios of greater than 100:1 are achieved.
[s₀, s₁, s₂, ..., s_n] = [3/2, 5/2, 7/2, ..., (2n − 1)/2, n + 1/2, 1/2]

Then from (9.47), the blending functions B_{i,3}(t) give the circulant matrix

B₃ = (1/8) · Circ[6, 1, 0, ..., 0, 1]

that is, each row of B₃ contains the weights 1, 6, 1 centered on the diagonal.
(b) Nonperiodic case: From (9.36) and (9.38) the knots and nodes are obtained as

[t₀, t₁, ..., t_{n+3}] = [0, 0, 0, 1, 2, 3, ..., n − 2, n − 1, n − 1, n − 1]

The boundary splines then follow from (9.35), for example

B_{0,3}(t) = { (1 − t)²,   0 ≤ t < 1
               0,          1 ≤ t ≤ n − 1

B_{1,3}(t) = { −(3/2)(t − 2/3)² + 2/3,   0 ≤ t < 1
               (1/2)(t − 2)²,            1 ≤ t < 2
               0,                        2 ≤ t ≤ n − 1

with the interior splines given by the uniform form (9.37b) and the splines near t = n − 1 given by the mirror images of B_{1,3}(t) and B_{0,3}(t).
Least squares techniques can now be applied to estimate the control points p_i. With proper indexing of the sampling points s_i and letting the ratio (m + 1)/(n + 1)
Figure 9.25 (a) Given points; (b) quadratic periodic B-spline interpolation; (c) quadratic nonperiodic B-spline interpolation.
Fourier Descriptors
The complex coefficients a(k) are called the Fourier descriptors (FDs) of the boundary. For a continuous boundary function u(t), defined in a manner similar to (9.49), the FDs are its (infinite) Fourier series coefficients. Fourier descriptors have been found useful in character recognition problems [32].
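A small sketch of computing discrete Fourier descriptors from a closed boundary (Python/NumPy). Representing the boundary as the complex sequence u(n) = x(n) + j y(n) and taking its DFT is the common convention; treat it as an assumption here, since (9.49) itself is not reproduced above.

```python
import numpy as np

def fourier_descriptors(x, y):
    """FDs a(k) of a closed boundary sampled at N points."""
    u = np.asarray(x, float) + 1j * np.asarray(y, float)    # u(n) = x(n) + j y(n)
    return np.fft.fft(u) / len(u)

def reconstruct(a, keep=5):
    """Shape reconstructed from the 'keep' largest-magnitude descriptors (cf. Fig. 9.27c)."""
    a = np.asarray(a, complex)
    trimmed = np.zeros_like(a)
    idx = np.argsort(np.abs(a))[-keep:]     # indices of the largest |a(k)|
    trimmed[idx] = a[idx]
    u = np.fft.ifft(trimmed) * len(a)
    return u.real, u.imag

# Example: a circular boundary sampled at 64 points
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
x, y = np.cos(t), np.sin(t)
a = fourier_descriptors(x, y)
xr, yr = reconstruct(a, keep=5)
```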
where

γ = −(A + jB)C / (A² + B²),    exp(j2θ) = −(A + jB)² / (A² + B²)    (9.53)

For example, if the line (9.52) is the x-axis, that is, A = C = 0, then θ = 0, γ = 0, and the new FDs are the complex conjugates of the old ones.
Fourier descriptors are also regenerative shape features. The number of descriptors needed for reconstruction depends on the shape and the desired accuracy. Figure 9.27 shows the effect of truncation and quantization of the FDs. From Table 9.8 it can be observed that the FD magnitudes have some invariant properties. For example, |a(k)|, k = 1, 2, ..., N − 1, are invariant to starting point, rotation, and reflection. The features a(k)/|a(k)| are invariant to scaling. These properties can be used in detecting shapes regardless of their size, orientation, and so on. However, the FD magnitude or phase alone is generally inadequate for reconstruction of the original shape (Fig. 9.27).
(9.54)

is small. The parameters u₀, α, n₀, and θ₀ are chosen to minimize the effects of translation, scaling, starting point, and rotation, respectively. If u(n) and v(n) are normalized so that Σ u(n) = Σ v(n) = 0, then for a given shift n₀, the above distance is
Figure 9.27 Fourier descriptors. (a) Given shape; (b) FDs, real and imaginary components; (c) shape derived from the largest five FDs; (d) derived from all FDs quantized to 17 levels each; (e) amplitude-only reconstruction; (f) phase-only reconstruction.
minimum when

Σ_k c(k) cos(ψ_k + kφ + θ₀)    (9.55)

and

tan θ₀ = − [ Σ_k c(k) sin(ψ_k + kφ) ] / [ Σ_k c(k) cos(ψ_k + kφ) ]

where a(k) b*(k) ≜ c(k) e^{jψ_k}, φ ≜ −2πn₀/N, and c(k) is a real quantity. These equations give α and θ₀, from which the minimum distance d is given by (9.61).
are used [31]. The latter has the advantage that θ(t) does not have the singularities at corner points that are encountered in polygonal shapes. Although we now have only a real scalar set of FDs, their rate of decay is found to be much slower than that of the FDs of u(t).
Autoregressive Models
where x(n) ≜ x₁(n) and y(n) ≜ x₂(n). Here x_i(n) is a stationary random sequence, μ_i is the ensemble mean of x_i(n), and ε_i(n) is an uncorrelated sequence with zero mean and variance β_i². For simplicity we assume ε₁(n) and ε₂(n) to be independent, so that the coordinates x₁(n) and x₂(n) can be processed independently. For closed boundaries the covariances of the sequences {x_i(n)}, i = 1, 2, will be periodic. The AR model parameters a_i(k), β_i², and μ_i can be considered as features of the given ensemble of shapes. These features can be estimated from a given boundary data set by following the procedures of Chapter 6. The AR model, identified for a class of objects, can also be used for compression of the boundary data x₁(n), x₂(n) via the DPCM method (see Chapter 11).
9.7 REGION REPRESENTATION
The shape of an object may be directly represented by the region it occupies. For
example, the binary array
Run-length Codes

Any region or binary image can be viewed as a sequence of alternating strings of 0s and 1s. Run-length codes represent these strings, or runs. For raster-scanned regions, a simple run-length code consists of the start address of each string of 1s (or 0s), followed by the length of that string (Fig. 9.28). There are several forms of run-length codes that are aimed at minimizing the number of bits required to represent binary images. Details are discussed in Section 11.9. Run-length codes have the advantage that, regardless of the complexity of the region, its representation is obtained in a single raster scan. The main disadvantage is that they do not give the region boundary points ordered along its contours, as in chain coding. This makes it difficult to segment different regions if several are present in an image.
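A minimal run-length coder for one raster line of a binary image (Python; the output format of (start column, length) pairs for the runs of 1s follows the description above, with 0-based column indices as an assumption of this sketch):

```python
def run_length_code(row):
    """Return (start, length) pairs for each run of 1s in a binary row."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v == 1 and start is None:
            start = i                            # a run of 1s begins
        elif v == 0 and start is not None:
            runs.append((start, i - start))      # run ended at column i - 1
            start = None
    if start is not None:                        # run extends to the end of the row
        runs.append((start, len(row) - start))
    return runs

# Example: one scan line
print(run_length_code([0, 1, 1, 1, 0, 0, 1, 1, 0]))   # [(1, 3), (6, 2)]
```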
Quad-trees [34]
(a) Binary image
Figure 9.28 Run-length coding for binary image representation.
totally black (1s) or totally white (0s). The quadrant that has both black and white pixels is called gray and is further divided into four quadrants. A tree structure is generated until each subquadrant is either black only or white only. The tree can be encoded by a unique string of symbols b (black), w (white), and g (gray), where each g is necessarily followed by four symbols or groups of four symbols representing the subquadrants; see, for example, Fig. 9.29. It appears that quad-tree coding would be more efficient than run-length coding from a data-compression standpoint. However, computation of shape measurements such as perimeter and moments, as well as image segmentation, may be more difficult.
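A sketch of this recursive encoding for a square binary image whose side is a power of two (Python/NumPy; the symbols b, w, g follow the text, while the ordering of the four subquadrants is an assumption of this illustration):

```python
import numpy as np

def quadtree_code(img):
    """Encode a binary image as a string of b (black), w (white), g (gray)."""
    if img.min() == img.max():                 # uniform quadrant
        return 'b' if img[0, 0] == 1 else 'w'
    h, w = img.shape
    h2, w2 = h // 2, w // 2
    quads = [img[:h2, :w2], img[:h2, w2:], img[h2:, :w2], img[h2:, w2:]]
    return 'g' + ''.join(quadtree_code(q) for q in quads)

# Example: an 8x8 image with one black quadrant
img = np.zeros((8, 8), int); img[:4, :4] = 1
print(quadtree_code(img))                      # 'gbwww'
```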
Projections

Figure 9.29 Quad-tree encoding. (a) Different quadrants; (b) quad-tree encoding.
Code: g b g b w w b w g b w w g b w w b
Decode as: g( b  g(b w w b)  w  g(b w w g(b w w b)) )
Definitions
Let f(x, y) ≥ 0 be a real, bounded function with support on a finite region ℛ. We define its (p + q)th-order moment as

m_{p,q} = ∫∫_ℛ x^p y^q f(x, y) dx dy    (9.64)

Note that setting f(x, y) = 1 gives the moments of the region ℛ, which could represent a shape. Thus the results presented here are applicable to arbitrary objects as well as to their shapes. Without loss of generality we can assume that f(x, y) is nonzero only in the region ℛ ⊂ {x ∈ (−1, 1), y ∈ (−1, 1)}. Then higher-order moments will, in general, have increasingly smaller magnitudes.
(9.67)

The infinite set of moments {m_{p,q}, p, q = 0, 1, ...} uniquely determines f(x, y), and vice versa.
The proof is obtained by expanding the exponential term in (9.65) into a power series, interchanging the order of integration and summation, using (9.64), and taking the Fourier transform of both sides. This yields the reconstruction formula

f(x, y) = ∫∫ e^{−j2π(xξ₁ + yξ₂)} Σ_{p=0}^{∞} Σ_{q=0}^{∞} m_{p,q} [(j2π)^{p+q}/(p! q!)] ξ₁^p ξ₂^q dξ₁ dξ₂    (9.68)

Unfortunately this formula is not practical, because we cannot interchange the order of integration and summation, due to the fact that the Fourier transform of (j2πξ)^p is not bounded. Therefore, we cannot truncate the series in (9.68) to find an approximation of f(x, y).
Moment Matching

Here f(x, y) is approximated by a polynomial

g(x, y) = Σ_{i=0}^{N} Σ_{j=0}^{N−i} g_{i,j} x^i y^j    (9.69)

The coefficients g_{i,j} can be found by matching the moments, that is, by setting the moments of g(x, y) equal to m_{p,q}. A disadvantage of this approach is that the coefficients g_{i,j}, once determined, change if more moments are included, meaning that we must solve a coupled set of equations whose size grows with N.
Example 9.5
For N = 3, we obtain 10 algebraic equations (p + q ≤ 3). (Show!) They form the linear system (9.70), which relates the coefficients g_{i,j} to the moments m_{p,q} of order up to 3.
Using the orthogonality of the Legendre polynomials,

∫_{−1}^{1} P_n(x) P_m(x) dx = [2/(2n + 1)] δ(m − n)

the orthogonal moments are obtained as

λ_{p,q} = [(2p + 1)(2q + 1)/4] ∫∫ f(x, y) P_p(x) P_q(y) dx dy
where the λ_{p,q} are called the orthogonal moments. Writing P_m(x) as an mth-order polynomial,

P_m(x) = Σ_{j=0}^{m} c_{m,j} x^j    (9.73)
the orthogonal moments depend on the usual moments, which are at most of the same order, and vice versa. Now an approximation to f(x, y) can be obtained by truncating (9.72) at a given finite order p + q = N, that is,

f(x, y) ≈ g(x, y) = Σ_{p=0}^{N} Σ_{q=0}^{N−p} λ_{p,q} P_p(x) P_q(y)    (9.76)

The preceding equation is the same as (9.69) except that the different terms have been regrouped. The advantage of this representation is that the equations for the λ_{p,q} are decoupled [see (9.74)], so that, unlike the m_{p,q}, the λ_{p,q} are not required to be updated as the order N is increased.
Moment Invariants

Under a translation of coordinates, the central moments

μ_{p,q} = ∫∫ (x − x̄)^p (y − ȳ)^q f(x, y) dx dy    (9.77)

are invariants, where x̄ ≜ m_{1,0}/m_{0,0} and ȳ ≜ m_{0,1}/m_{0,0}. In the sequel we will consider only the central moments.
Scaling. Under a scale change x′ = ax, y′ = ay, the moments of f(ax, ay) change to μ′_{p,q} = μ_{p,q}/a^{p+q+2}. The normalized moments, defined as

η_{p,q} = μ_{p,q} / (μ_{0,0})^γ,    γ = (p + q + 2)/2    (9.78)

are then invariant to size change.
Rotation and reflection. Under the linear coordinate transformation

x′ = αx + βy,    y′ = γx + δy    (9.79)

the moment-generating function will change. Via the theory of algebraic invariants [37], it is possible to find certain polynomials of μ_{p,q} that remain unchanged under the transformation of (9.79). For example, some moment invariants with respect to rotation (that is, for α = δ = cos θ, β = −γ = sin θ) and reflection (α = −δ = cos θ, β = γ = sin θ) are given as follows:
•
1. For first-order moments, 1ko.1 = J.lol.O = 0, (always invariant).
2. For second-order moments, (p + q = 2), the invariants are
$1 = 1k2.0 + ....0.2
• (9.80) ,
<l>z = ( !J.2,O - "!J.0.2
)2 2
+ 4P;1.1
3. For third-order moments (p + q = 3), the invariants are
$3 = (!J.3,O - 3J.loI, 2)2 + (IkO.3 ..... 31k2,li
. $4 = (J.lo3.0 + Ikl,2)2 + (IkO,3 + J.lo2.1)2
•
2
<P5 = (113,0 - 3Ikd(113.0 + Ikd[(113.0 + Ikd - 3(J.l.2,1 + 1ko.3)2] (9.81)
, -t (/ko.3- 31k2,1)(1ko,3 +'.Ik2,I)[(J.loo,3 + 1k2,1)2 - 3(J.loI,2 + 1-L),o)ZJ
2
4>6 = (112:0 -Iko,2}(J.lo3.0 + Ikd - (J.lo2.1+ 1ko.3)2] + 41kl,l (J.lo3.0 + jJ.d(110,3 + 1k2.1)
The relationship between invariant moments and the μ_{p,q} becomes more complicated for higher-order moments. Moment invariants can be expressed more conveniently in terms of what are called Zernike moments. These moments are defined as the projections of f(x, y) on a class of polynomials, called Zernike polynomials [36]. These polynomials are separable in polar coordinates and are orthogonal over the unit circle.
Applications of Moment Invariants

Being invariant under linear coordinate transformations, the moment invariants are useful features in pattern-recognition problems. Using N moments, for instance, an image can be represented as a point in an N-dimensional vector space. This converts the pattern-recognition problem into a standard decision-theory problem, for which several approaches are available. For binary digital images we can set f(x, y) = 1, (x, y) ∈ ℛ. Then the moment calculation reduces to the separable computation

These moments are useful for shape analysis. Moments can also be computed optically [38] at high speeds. Moments have been used in distinguishing between shapes of different aircraft, in character recognition, and in scene-matching applications [39, 40].
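A compact sketch of computing the central moments (9.77) and the two second-order invariants φ₁, φ₂ of (9.80) for a binary image (Python/NumPy; replacing the integrals by sums over pixel coordinates is an assumption consistent with the binary-image remark above):

```python
import numpy as np

def central_moment(img, p, q):
    """Discrete central moment mu_{p,q} of a binary image, cf. (9.77)."""
    ys, xs = np.nonzero(img)                     # pixel coordinates of the region
    xbar, ybar = xs.mean(), ys.mean()            # center of mass
    return np.sum((xs - xbar) ** p * (ys - ybar) ** q)

def second_order_invariants(img):
    """phi_1 and phi_2 of (9.80), invariant to translation, rotation, reflection."""
    mu20 = central_moment(img, 2, 0)
    mu02 = central_moment(img, 0, 2)
    mu11 = central_moment(img, 1, 1)
    return mu20 + mu02, (mu20 - mu02) ** 2 + 4.0 * mu11 ** 2

# Example: an axis-aligned rectangle and the same rectangle rotated by 90 degrees
a = np.zeros((20, 20), int); a[5:9, 3:17] = 1
print(second_order_invariants(a))
print(second_order_invariants(a.T))              # identical invariant values
```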
9.9 STRUCTURE
•
Suppose that a fire line propagates with constant speed from the contour of a connected object towards its inside. Then all those points lying in positions where at

1. Distance transform

u_k(m, n) = u₀(m, n) + min { u_{k−1}(i, j) : Δ(m, n; i, j) ≤ 1 },    k = 1, 2, ...
u₀(m, n) ≜ u(m, n)    (9.84)

where Δ(m, n; i, j) is the distance between (m, n) and (i, j). The transform is done when k equals the maximum thickness of the region.
Figure 9.32 (a) Neighborhood labeling around P₁; (b) deletion conditions; (c) example of thinning: (i) original, (ii) thinned.
obtained is not influenced by small contour inflections that may be present on the initial contour. The basic approach [42] is to delete from the object X simple border points that have more than one neighbor in X and whose deletion does not locally disconnect X. Here a connected region is defined as one in which any two points in the region can be connected by a curve that lies entirely in the region. In this way, endpoints of thin arcs are not deleted. A simple algorithm that yields connected arcs while being insensitive to contour noise is as follows [43].

Referring to Figure 9.32a, let ZO(P₁) be the number of zero to nonzero transitions in the ordered set P₂, P₃, P₄, ..., P₉, P₂. Let NZ(P₁) be the number of nonzero neighbors of P₁. Then P₁ is deleted if (Fig. 9.32b)

2 ≤ NZ(P₁) ≤ 6
and ZO(P₁) = 1
and P₂ · P₄ · P₈ = 0 or ZO(P₂) ≠ 1    (9.86)
and P₂ · P₄ · P₆ = 0 or ZO(P₄) ≠ 1

The procedure is repeated until no further changes occur in the image. Figure 9.32c gives an example of applying this algorithm. Note that at each location such as P₁ we end up examining pixels from a 5 × 5 neighborhood.
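A sketch of this deletion test (Python/NumPy). Zero-padding at the image border and the clockwise neighbor ordering P₂...P₉ starting at the north neighbor are assumptions of this illustration.

```python
import numpy as np

def neighbors(pad, i, j):
    """P2..P9 of the pixel at padded (i, j), clockwise from the north neighbor."""
    return [pad[i-1, j], pad[i-1, j+1], pad[i, j+1], pad[i+1, j+1],
            pad[i+1, j], pad[i+1, j-1], pad[i, j-1], pad[i-1, j-1]]

def zo(pad, i, j):
    """ZO(.): number of 0 -> nonzero transitions around the pixel at padded (i, j)."""
    P = neighbors(pad, i, j)
    ring = P + [P[0]]
    return sum(1 for a, b in zip(ring, ring[1:]) if a == 0 and b != 0)

def thinning_pass(X):
    """One pass of the deletion test (9.86) over a binary image X."""
    out = X.copy()
    pad = np.pad(X, 2)                       # wide pad keeps neighbor tests in bounds
    for r, c in zip(*np.nonzero(X)):
        i, j = r + 2, c + 2
        P = neighbors(pad, i, j)
        P2, P4, P6, P8 = P[0], P[2], P[4], P[6]
        if (2 <= sum(P) <= 6 and zo(pad, i, j) == 1
                and (P2 * P4 * P8 == 0 or zo(pad, i - 1, j) != 1)
                and (P2 * P4 * P6 == 0 or zo(pad, i, j + 1) != 1)):
            out[r, c] = 0
    return out

def thin(X):
    """Repeat until no further changes occur, as in the text."""
    prev = None
    while prev is None or not np.array_equal(prev, X):
        prev, X = X, thinning_pass(X)
    return X
```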
•
Sec.9.B . Structure
•
•
"The term morphology originally comes from the study of forms of plants and
.animals. In our context we mean study of topology or structure of objects from their
images. Morphological processing refers to certain operations where an object is hit
with a structuring element and thereby reduced to a more revealing shape.
Properties. The erosion and dilation operations have the following properties:

1. They are translation invariant; that is, a translation of the object causes the same shift in the result.
2. They are not inverses of each other.
3. Distributivity:
   X ⊕ (B ∪ B′) = (X ⊕ B) ∪ (X ⊕ B′)
   X ⊖ (B ∪ B′) = (X ⊖ B) ∩ (X ⊖ B′)    (9.89)
4. Local knowledge:
   (X ∩ Z) ⊖ B = (X ⊖ B) ∩ (Z ⊖ B)    (9.90)
5. Iteration:
   (X ⊖ B) ⊖ B′ = X ⊖ (B ⊕ B′)
   (X ⊕ B) ⊕ B′ = X ⊕ (B ⊕ B′)    (9.91)
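A minimal sketch of binary erosion and dilation over a 3 × 3 structuring element (Python/NumPy; the all-ones structuring element and zero padding are assumptions of this illustration):

```python
import numpy as np

def dilate(X, B):
    """Binary dilation: a pixel is set if B, centered there, hits X anywhere."""
    h, w = B.shape
    pad = np.pad(X, ((h // 2, h // 2), (w // 2, w // 2)))
    out = np.zeros_like(X)
    for i in range(h):
        for j in range(w):
            if B[i, j]:
                out |= pad[i:i + X.shape[0], j:j + X.shape[1]]
    return out

def erode(X, B):
    """Binary erosion: a pixel survives only if B, centered there, fits inside X."""
    h, w = B.shape
    pad = np.pad(X, ((h // 2, h // 2), (w // 2, w // 2)))
    out = np.ones_like(X)
    for i in range(h):
        for j in range(w):
            if B[i, j]:
                out &= pad[i:i + X.shape[0], j:j + X.shape[1]]
    return out

# Opening (erode, then dilate) of a small object with a 3x3 square element
X = np.zeros((9, 9), int); X[2:7, 2:7] = 1; X[0, 0] = 1   # object plus an isolated speck
B = np.ones((3, 3), int)
opened = dilate(erode(X, B), B)                           # the speck is removed
```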
""""
o (l'IHI
OOl'Ill'"
Oiflllllll
DILATE
.....
o .. e. o"oeo" •
'lIl0.00
""ll>.
HIT· MISS
.0"1> , 0000
•
. " ••
CIlO".
IHn~ .. 0000
0000
0000
••••
",e • • • • •
1~iJf-c.'1infl
IN comers} . 0000
0.000 a 0"
••••e
\) IIHt
0·0 00
0000
(}3JECT STflUCTURE
ELEMEI'~'l'
•
•••• e • II"" 1Il Ii>
c ••• o .0000
4110 • • • eooo<ll
o •••• >ltOCQl QG.~eOOO.
•
~.
fbC$$
e\i!l;)ee lI'O(ll.Clll
CHIP G """ : / ' ('rigl" • 'lHI Ii' "Hll III !IJ
'1"1> '" " 1II O4JI$O.b.OtJO
<lHI' " " e e e <II I) It e e I1HH' llH!HlHI' G
Ill'll! IH~ llH~
e e e e e e e e.e
'1Ht ..
<1/
" It <I"" III <ilHI "..., IHI!
<lHlHHIH1HlHIH' '"
IIHI • 1Il '" IlHII
'" IHH;Hi'
•
Figure 9.33 Examples of some morphological operations.
Morphological Transforms

The medial axis transform and thinning operations are just two examples of morphological transforms. Table 9.10 lists several useful morphological transforms that are derived from the basic erosion and dilation operations. The hit-miss transform tests whether or not the structure B_ob belongs to X and B_bk belongs to Xᶜ. The opening of X with respect to B, denoted by X_B, defines the domain swept by all translates of B that are included in X. Closing is the dual of opening. Boundary gives the boundary pixels of the object, but they are not ordered along its contour. This table also shows how the morphological operations can be used to obtain the previously defined skeletonizing and thinning transformations. Thickening is the dual of thinning. The pruning operation smooths skeletons or thinned objects by removing parasitic branches.

Figure 9.33 shows examples of morphological transforms. Figure 9.34 shows an application of morphological processing in a printed circuit board inspection application. The observed image is binarized by thresholding and is reduced to a single-pixel-wide contour image by the thinning transform. The result is pruned to obtain clean line segments, which can be used for inspection of faults such as cuts (open circuits), short circuits, and the like.

We now give the development of skeleton and thinning algorithms in the context of the basic morphological operations.
Skeletons. Let rD_x denote a disc of radius r at point x. Let s_r(x) denote the set of centers of the maximal discs rD_x that are contained in X and intersect the boundary of X at two or more locations. Then the skeleton S(X) is the set of centers s_r(x):

S(X) = ∪_{r>0} s_r(x) = ∪_{r>0} { (X ⊖ rD) / (X ⊖ rD)_{drD} }    (9.94)

where ∪ and / represent the set union and set difference operations, respectively, and drD denotes opening with respect to an infinitesimal disc.

To recover the original object from its skeleton, we take the union of circular neighborhoods centered on the skeleton points and having radii equal to the associated contour distance:

X = ∪_{r>0} { s_r(x) ⊕ rD }    (9.95)

We can find the skeleton on a digitized grid by replacing the disc rD in (9.94) by the 3 × 3 square grid G, which leads to the algorithm summarized in Table 9.10. Here the operation (X ⊖ nG) denotes the nth iteration (X ⊖ G ⊖ G ⊖ ⋯ ⊖ G), and (X ⊖ nG)_G is the opening of (X ⊖ nG) with respect to G.
•
1 d d o 0 0 d d d
•
1 1 1
G= 1• 1 1 , c= 1 d , L= 1 , E= .0 1 0
1 1 1 1 d 1 o (} (}
where 1, 0, and d signify the object, background and 'don't care' states, respectively.
X ⊗ B ≜ X / (X ⊛ B)    (9.96)

where B is the structuring element chosen for the thinning and ⊛ denotes the hit-miss operation defined in Table 9.10.

To thin X symmetrically, a sequence of structuring elements, {B} ≜ {Bⁱ, 1 ≤ i ≤ n}, is used in cascade, where Bⁱ is a rotated version of B^{i−1}:

X ⊗ {B} = (( ⋯ ((X ⊗ B¹) ⊗ B²) ⋯ ) ⊗ Bⁿ)    (9.97)

A suitable structuring element for the thinning operation is the L structuring element shown in Table 9.10.

The thinning process is usually followed by a pruning operation to trim the resulting arcs (Table 9.10). In general, the original objects are likely to have noisy boundaries, which result in unwanted parasitic branches in the thinned version. It is the job of this step to clean these up without disconnecting the arcs.
(Figure 9.35: primitive structural symbols a, b, c, d, and an object structure described in terms of these symbols.)
The shape of an object refers to its profile and physical structure. These characteristics can be represented by the previously discussed boundary, region, moment, and structural representations. These representations can be used for matching shapes, recognizing objects, or making measurements of shape. Figure 9.36 lists several useful features of shape.
Figure 9.36 Features of shape.
Shape representation: boundaries, regions, moments, structural and syntactic.
Geometry features: perimeter, area, max-min radii and eccentricity, corners, roundness, bending energy, holes, Euler number, symmetry.
Moment features: center of mass, orientation, bounding rectangle, best-fit ellipse, eccentricity.
In many image analysis problems the ultimate aim is to measure certain geometric attributes of the object, such as the following:

1. Perimeter

(9.98)

where t is the boundary parameter but not necessarily its length.

2. Area

where ℛ and ∂ℛ denote the object region and its boundary, respectively.

3. Radii. R_min and R_max are the minimum and maximum distances, respectively, to the boundary from the center of mass (Fig. 9.37a). Sometimes the ratio R_max/R_min is used as a measure of eccentricity or elongation of the object.
Figure 9.37 (a) R_min and R_max measured from the center of mass; (c) types of symmetry: square A has 4-fold symmetry; circle B is rotationally symmetric; small circles C_i have 4-fold symmetry; triangles have 3-fold symmetry.
7. Bending energy

E = (1/T) ∫₀^T |κ(t)|² dt    (9.102)

E = Σ_{k=−∞}^{∞} |a(k)|² (2πk/T)⁴    (9.103)

8. Roundness, or compactness

γ = (perimeter)² / (4π · area)    (9.104)
1. Center of mass

m̄ = (1/N) Σ_{(m,n)∈ℛ} m,    n̄ = (1/N) Σ_{(m,n)∈ℛ} n

The (p, q)-order central moments become

μ_{p,q} = Σ_{(m,n)∈ℛ} (m − m̄)^p (n − n̄)^q    (9.106)
Figure 9.38 (a) Orientation θ of the object; (b) bounding rectangle.
I(θ) = Σ Σ_{(m,n)∈ℛ} D²(m, n) = Σ Σ_{(m,n)∈ℛ} [(n − n̄) cos θ − (m − m̄) sin θ]²    (9.107)

The result is

θ = ½ tan⁻¹[ 2μ_{1,1} / (μ_{2,0} − μ_{0,2}) ]    (9.108)
β = −x sin θ + y cos θ

on the boundary points and search for α_min, α_max, β_min, and β_max. These give the locations of the corresponding extreme points in Fig. 9.38b. From these
• •
(9.110)
(m,n) E !if I
For the best-fit ellipse we want lmin = l:..m,l""" = l:n... which gives
. = (1) 1/4 (1:".,)3 118 = (1) 1/4 (I:"in)3 IIll ,
(9.112)
a 'IT I'.
mm
' b 'IT ['
max
5. Eccentricity
11 (""2.0 - floo.z)2 + 4""1, I'
E= '
area
Other representations of eccentricity are Rm.,IRmin, l:'•• II:"in, and alb.
The foregoing shape features are very useful in the design of vision systems for
object recognition.
9.11 TEXTURE
Statistical Approaches
Textures that are random in nature are well suited for statistical characterization, for example, as realizations of random fields. Figure 9.40 lists several statistical measures of texture. We discuss these briefly next.
(Figure: classification of texture.)
The autocorrelation function (ACF). The spatial size of the tonal primitives (i.e., texels) in a texture can be represented by the width of the spatial ACF r(k, l) = m₂(k, l)/m₂(0, 0) [see (9.7)]. The coarseness of the texture is expected to be proportional to the width of the ACF, which can be represented by distances x₀, y₀ such that r(x₀, 0) = r(0, y₀) = ½. Other measures of the spread of the ACF are obtained
where

μ₁ ≜ Σ_m Σ_n m r(m, n),    μ₂ ≜ Σ_m Σ_n n r(m, n)

Features of special interest are the profile spreads M(2, 0) and M(0, 2), the cross-relation M(1, 1), and the second-degree spread M(2, 2). The calibration of the ACF spread on a fine-coarse texture scale depends on the resolution of the image. This is because a seemingly flat region (no texture) at a given resolution could appear as fine texture at a higher resolution and as coarse texture at a lower resolution. The ACF by itself is not sufficient to distinguish among several texture fields, because many different image ensembles can have the same ACF.
(9.116)
•
(9.117)
histograms.
A simple model for texture analysis is shown in Fig. 9.41a [51]. The texture field is first decorrelated by a filter a(m, n), which can be designed from knowledge of the ACF. Thus if r(m, n) is the ACF, then

u(m, n) ⊛ a(m, n) ≜ ε(m, n)    (9.120)

is an uncorrelated random field. From Chapter 6 (see Section 6.6) this means that any WNDR of u(m, n) would give an admissible whitening (or decorrelating) filter.
Figure 9.41 Random texture models. (a) Texture feature extraction: ACF measurement, decorrelating filter a(m, n), and feature extraction yielding the texture feature vector x; (b) texture synthesis using linear filters.
Such a filter is not unique, and it could have a causal, semicausal, or noncausal structure. Since edge extraction operators have a tendency to decorrelate images, these have been used [51] as alternatives to the true whitening filters. The ACF features such as M(0, 2), M(2, 0), M(1, 1), and M(2, 2) [see (9.113)] and the features of the first-order histogram of ε(m, n), such as the mean, deviation, skewness μ₃, and kurtosis μ₄ − 3, have been used as the elements of the texture feature vector x in Fig. 9.41a.

Random field representations of texture have been considered using one-dimensional time series as well as two-dimensional random field models (see [52], [53] and the bibliography of Chapter 6). Following Chapter 6, such models can be identified from the given data. The model coefficients are then used as features for texture discrimination. Moreover, these random field models can synthesize random texture fields when driven by an uncorrelated random field ε(m, n) of known probability density (Fig. 9.41b).
Example 9.6 Texture synthesis via causal and semicausal models
Figure 9.42a shows a given 256 × 256 grass texture. Using estimated covariances, a (p, q) = (3, 4)-order white Gaussian noise-driven causal model was designed and used to synthesize the texture of Fig. 9.42b. Figure 9.42c shows the texture synthesized via a (p, q) = (3, 4) semicausal white-noise-driven model. This model was designed via the Wiener-Doob homomorphic factorization method of Section 6.8.
Purely structural textures are deterministic texels, which repeat according to some placement rules, deterministic or random. A texel is isolated by identifying a group of pixels having certain invariant properties, which repeat in the given image. The texel may be defined by its gray level, shape, or homogeneity of some local property, such as size, orientation, or second-order histogram (co-occurrence matrix). The placement rules define the spatial relationships between the texels. These spatial relationships may be expressed in terms of adjacency, closest distance, periodicities, and so on, in the case of deterministic placement rules. In such cases the texture is labeled as being strong.

Figure 9.42 Texture synthesis using causal and semicausal models. (a) Original grass texture; (b) texture synthesized by the causal model; (c) texture synthesized by the semicausal model.

For randomly placed texels, the associated texture is called weak, and the placement rules may be expressed in terms of measures such as the following:
1. Edge density
2. Run lengths of maximally connected texels
3. Relative extrema density, which is the number of pixels per unit area whose gray levels are local maxima or minima relative to their neighbors. For example, a pixel u(m, n) is a relative minimum or a relative maximum if it is, respectively, less than or greater than its nearest four neighbors. (In a region of constant gray level, which may be a plateau or a valley, each pixel counts as an extremum.) This definition does not distinguish between images having a few large plateaus and those having many single extrema. An alternative is to count each plateau as one extremum. The height and the area of each extremum may also be considered as features describing the texels.
Example 9.7 Synthesis of quasiperiodic textures
The raffia texture (Fig. 9.43a) can be viewed as a quasiperiodic repetition of a deterministic pattern. The spatial covariance function of a small portion of the image was analyzed to estimate the periodicity and the randomness in the repetition rate. A 17 × 17 primitive was extracted from the parent texture and repeated according to the quasiperiodic placement rule to give the image of Fig. 9.43b.
Other Approaches
A method that combines the statistical and the structural approaches is based on
what have been called mosaic models [55]. These models represent random geo-
metrical processes. For example, regular or random tessellations of a plane into
bounded convex polygons give rise to cell-structured textures. A mosaic model
.",'.. "
... - ~
.' -"1"''/;;':,'
--- '
" ~
- 1# __ ~
~,
••
•
,..
,,-
•-', -
. - ..., -..
• •
-
C __" - , " : , ,
,
.-- ,'.
~"~""''''-'"'
+•-"4_'_"
• ?
- .
,.'
•
,-
.....
--,,',,-
_,
-<
_._'
-~-'-~'
. .
.-
,-"\,
--
.. J_
t _ ,'".';
,._,},X,f.
~
~
Image Subtraction

The presence of a known object in a scene can be detected by searching for the location of a match between the object template u(m, n) and the scene v(m, n). Template matching can be conducted by searching for the displacement of u(m, n) at which the mismatch energy is minimum. For a displacement (p, q), we define the mismatch energy
"\\1
• '.. ,
•
,
.~
•
•
1
. a) Precontrast b) Pcstcontrast
•
,••
; ""
•
I,
I••
•
•
!•, •
,I•
•,•,
!
I
s
.,
c) Difference
-------.. "
111m - p, n - q)
where the equality occurs if and only if v(m, n) = a u(m − p, n − q), where a is an arbitrary constant that can be set equal to 1. This means the cross-correlation c_vu(p, q) attains its maximum value when the displaced position of the template coincides with the observed image. Then we obtain

where γ₁ and γ₂ are the scale factors, (p′, q′) are the displacement coordinates, and θ is the rotation angle of the observed image with respect to the template. In such cases the cross-correlation function maxima have to be searched in the parameter space (p′, q′, γ₁, γ₂, θ). This can become quite impractical unless reasonable estimates of γ₁, γ₂, and θ are given.

The cross-correlation c_vu(p, q) is also called the area correlation. It can be evaluated either directly or as the inverse Fourier transform of

C_vu(ω₁, ω₂) ≜ ℱ{c_vu(p, q)} = V(ω₁, ω₂) U*(ω₁, ω₂)    (9.127)

The direct computation of the area correlation is useful when the template is small. Otherwise, a suitable-size FFT is employed to perform the Fourier transform calculations. Template matching is particularly efficient when the data are binary. In that case, it is sufficient to search for the minima of the total binary difference

which requires only simple logical exclusive-OR operations. The quantity γ_vu(p, q) gives the number of pixels in the image that do not match the template at location (p, q). This algorithm is useful in the recognition of printed characters or of objects characterized by known boundaries, as in the inspection of printed circuit boards.
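A small sketch of direct template matching (Python/NumPy), computing the binary mismatch count described above at every displacement and returning the best match location; exhaustive search over all displacements is an assumption of this illustration.

```python
import numpy as np

def binary_match(scene, template):
    """Count of mismatching pixels (logical XOR) at every displacement (p, q)."""
    H, W = scene.shape
    h, w = template.shape
    cost = np.full((H - h + 1, W - w + 1), np.inf)
    for p in range(H - h + 1):
        for q in range(W - w + 1):
            window = scene[p:p + h, q:q + w]
            cost[p, q] = np.count_nonzero(window != template)   # XOR count
    best = np.unravel_index(cost.argmin(), cost.shape)
    return best, cost

# Example: find a 3x3 cross pattern in a binary scene
scene = np.zeros((10, 10), int)
cross = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
scene[4:7, 5:8] = cross
pos, _ = binary_match(scene, cross)
print(pos)                                                      # (4, 5)
```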
The matched filtering problem is to find a linear filter g(m, n) that maximizes the output signal-to-noise ratio (SNR)

SNR ≜ |s(0, 0)|² / Σ_m Σ_n E[ |g(m, n) ⊛ η(m, n)|² ],    s(m, n) ≜ g(m, n) ⊛ u(m − m₀, n − n₀)    (9.130)

Here s(m, n) represents the signal content in the filtered output g(m, n) ⊛ v(m, n). Following Problem 9.16, the matched filter frequency response is found to be

G(ω₁, ω₂) = [ U*(ω₁, ω₂) / S_ηη(ω₁, ω₂) ] exp[ −j(ω₁ m₀ + ω₂ n₀) ]    (9.131)

Defining

ṽ(m, n) ≜ v(m, n) ⊛ r(m, n)    (9.133)

which, according to (9.123), is c_ṽu(m + m₀, n + n₀), the area correlation of ṽ(m, n) with u(m + m₀, n + n₀). If (m₀, n₀) were known, then the SNR would be maximized at (m, n) = (0, 0), as desired in (9.130) (show!). In practice these displacement values are unknown. Therefore, we compute the correlation c_ṽu(m, n) and search for the location of its maximum, which gives (m₀, n₀). Thus the matched filter can be implemented as an area correlator with a preprocessing filter (Fig. 9.46a). Recall from Section 6.7 [Eq. (6.91)] that r(m, n) would be proportional to the impulse response of the minimum variance noncausal prediction error filter for a random field with power spectral density S_ηη(ω₁, ω₂). For highly correlated random fields, such as the usual monochrome images, r(m, n) represents a high-pass filter. For example, if the background has an object-like power spectrum [see Section 2.11]
Figure 9.47 Two-dimensional logarithmic search. (a) The algorithm; (b) example. Courtesy of Stuart Wells, Heriot-Watt University, U.K.
•
. {circled numbers) gives the location of the new center .for the next step. This
procedure continues until the plane of search reduces to a 3 x 3 size. In the final
step aU the nine locations are searched and the location corresponding to the
minimum gives the DMD. .
.If the direction of minimum distortion lies outside.;Y(p), the algorithm con-
verges to a point on the boundary that is closest to the DMD. This algorithm has
been found useful for estimating planar-motion of objects by measuring displace-
ments of local regions from one frame to another. Figure 9.47b shows the motion .
vectors detected in an underwater scene involving a diver and turbulent water flow.
Figure 9.48 Hierarchical search. Shaded area shows the region where the match occurs. Dotted lines show the regions searched.
Sequential search. Another way of speeding up the search is to compute the cumulative error

and terminate the search at (i, j) if e_{p,q}(i, j) exceeds some predetermined threshold. The search may then be continued only in those directions where e_{p,q}(i, j) is below the threshold.

Another possibility is to search in the i direction until a minimum is found and then switch the search to the j direction. This search in alternating conjugate directions is continued until the location of the minimum remains unchanged.
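A sketch of block matching with the sequential (early-termination) test (Python/NumPy; the sum-of-absolute-differences error measure and the search range are assumptions of this illustration, since the exact cumulative error expression is not reproduced above):

```python
import numpy as np

def block_match(block, frame, center, search=4, threshold=None):
    """Find the displacement of `block` inside `frame` near `center`,
    abandoning a candidate once its running error exceeds the best so far."""
    block = np.asarray(block, float)
    h, w = block.shape
    best_err, best_pq = np.inf, (0, 0)
    r0, c0 = center
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            r, c = r0 + p, c0 + q
            if r < 0 or c < 0 or r + h > frame.shape[0] or c + w > frame.shape[1]:
                continue
            limit = best_err if threshold is None else min(best_err, threshold)
            err = 0.0
            for i in range(h):                                   # accumulate row by row
                err += np.abs(frame[r + i, c:c + w].astype(float) - block[i]).sum()
                if err >= limit:                                 # terminate candidate early
                    break
            if err < best_err:
                best_err, best_pq = err, (p, q)
    return best_pq, best_err
```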
Amplitude thresholding is useful whenever the amplitude features (see Section 9.2) sufficiently characterize the object. The appropriate amplitude feature values are calibrated so that a given amplitude interval represents a unique object characteristic. For example, the large amplitudes in the remotely sensed IR image of Fig. 7.10b represent low temperatures or high altitudes. Thresholding the high-intensity values segments the cloud patterns (Fig. 7.10d). Thresholding techniques are also useful in the segmentation of binary images such as printed documents, line drawings and graphics, multispectral and color images, X-ray images, and so on. Threshold
Component Labeling
Pixel labeling. Suppose a binary image is raster scanned left to right and top to bottom. The current pixel, X (Fig. 9.51), is labeled as belonging to either an object (1s) or a hole (0s) by examining its connectivity to the neighbors A, B, C, and D. For example, if X = 1, then it is assigned to the object(s) to which it is connected. If there are two or more qualified objects, then those objects are declared to be equivalent and are merged. A new object label is assigned when a transition from 0s to an isolated 1 is detected. Once the pixel is labeled, the features of that object are updated. At the end of the scan, features such as centroid, area, and perimeter are saved for each region of connected 1s.
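A compact two-pass sketch of this pixel-labeling idea (Python/NumPy; 4-connectivity through the previously scanned left and upper neighbors, and a union-find table for merging equivalent labels, are implementation choices of this illustration):

```python
import numpy as np

def label_components(img):
    """Raster-scan connected component labeling of a binary image (4-connectivity)."""
    labels = np.zeros(img.shape, int)
    parent = [0]                                   # union-find table; label 0 = background

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    next_label = 1
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            if img[r, c] == 0:
                continue
            up = labels[r - 1, c] if r > 0 else 0
            left = labels[r, c - 1] if c > 0 else 0
            if up == 0 and left == 0:              # transition to an isolated 1: new object
                parent.append(next_label)
                labels[r, c] = next_label
                next_label += 1
            else:
                labels[r, c] = max(find(up), find(left)) if up and left else find(up or left)
                if up and left and find(up) != find(left):     # two objects meet: merge
                    parent[max(find(up), find(left))] = min(find(up), find(left))
    for r, c in zip(*np.nonzero(labels)):          # second pass: resolve merged labels
        labels[r, c] = find(labels[r, c])
    return labels
```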
Figure 9.52 Run-length connectivity analysis: runs a, b, c, ..., w of a binary image and the segmentation table, with one column per object (A, B, C), the object level, and the convergence/divergence flags IC1, IC2, ID1, ID2.
b belongs to the object A and is placed underneath a in the first column. Since c is of a different color, it is placed in a new column, for an object labeled B. The run d is of the same color as a and overlaps a. Since b and d both overlap a, divergence is said to have occurred, and a new column of object A is created, where d is placed. A divergence flag ID1 is set in this column to indicate that object B has caused this divergence. Also, the flag ID2 of B (column 2) is set to A to indicate that B has caused divergence in A. Similarly, convergence occurs when two or more runs of 0s or 1s in a given line overlap with a run of the same color in the previous line. Thus convergence occurs at run u, which sets the convergence flag IC1 to C in column 4 and IC2 to B in column 6. Similarly, w sets the convergence flag IC2 to A in column 2, and column 5 is labeled as belonging to object A.

In this manner, all the objects with different closed boundaries are segmented in a single pass. The segmentation table gives the data relevant to each object. The convergence and divergence flags also give the hierarchy structure of the objects. Since B causes divergence as well as convergence in A, and C has a similar relationship with B, the objects A, B, and C are assigned levels 1, 2, and 3, respectively.
Example 9.9
A vision system based on run-length connectivity analysis is outlined in Fig. 9.53a. The input object is imaged and digitized to give a binary image. Figure 9.53b shows the run-length representation of a key and its segmentation into the outer profile and the
Boundary-Based Approaches
"-I \
I \
I X"Il.
. -
\
Cubic spline
,
fit
•
• •
• ,
• (bl •
1. Merge two regions ℛ_i and ℛ_j if w/P_m > θ₁, where P_m = min(P_i, P_j), P_i and P_j are the perimeters of ℛ_i and ℛ_j, and w is the number of weak boundary locations (pixels on either side of which the magnitude difference is less than some threshold). The parameter θ₁ controls the size of the region to be merged. For example, θ₁ = 1 implies two regions will be merged only if one of the regions almost surrounds the other. Typically, θ₁ = 0.5.
2. Merge ℛ_i and ℛ_j if w/l > θ₂, where l is the length of the common boundary between the two regions. Typically θ₂ = 0.75. Thus the two regions are merged if the common boundary is sufficiently weak. Often this step is applied after the first heuristic has been used to reduce the number of regions.
3. Merge ℛ_i and ℛ_j only if there are no strong edge points between them. Note that the run-length connectivity method for binary images can be interpreted as an example of this heuristic.
4. Merge ℛ_i and ℛ_j if their similarity distance [see Section 9.14] is less than a threshold.
(Figure: split-and-merge example. (a) Input region; quadrants 1 to 4 are successively split, and adjacent similar regions A, B, C, D are merged.)
Template Matching
Texture Segmentation
A major task after feature extraction is to classify the object into one of several categories. Figure 9.2 lists various classification techniques applicable in image analysis. Although an in-depth discussion of classification techniques can be found in the pattern-recognition literature (see, for example, [1]), we will briefly review these here to establish their relevance in image analysis.

It should be mentioned that classification and segmentation processes have closely related objectives. Classification can lead to segmentation, and vice versa. Classification of pixels in an image is another form of component labeling that can result in segmentation of various objects in the image. For example, in remote sensing, classification of multispectral data at each pixel location results in segmentation of various regions of wheat, barley, rice, and the like. Similarly, image segmentation by template matching, as in character recognition, leads to classification or identification of each object.

There are two basic approaches to classification, supervised and nonsupervised, depending on whether or not a set of prototypes is available.
Supervised Learning
Such a function arises, for example, when x is classified to the class whose centroid is nearest in Euclidean distance to it (Problem 9.17). The associated classifier is called the minimum mean (Euclidean) distance classifier.
An alternative decision rule is to classify x to Si if, among a total of k nearest prototype neighbors of x, the maximum number of neighbors belong to class Si. This is the k-nearest neighbor classifier, which for k = 1 becomes a minimum-distance classifier.
When the discriminant function can classify the prototypes correctly for some linear discriminants, the classes are said to be linearly separable. In that case, the weights ak and bk can be determined via a successive linear training algorithm. Other discriminants can be piecewise linear, quadratic, or polynomial functions. The k-nearest neighbor classification can be shown to be equivalent to using piecewise linear discriminants.
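A minimal sketch of the two decision rules just described. The function and variable names are assumptions for illustration; only the rules themselves (nearest centroid, and majority vote among the k nearest prototypes) come from the text.

import numpy as np

def minimum_distance_classify(x, centroids):
    # Assign x to the class whose centroid is nearest in Euclidean distance.
    d = np.linalg.norm(centroids - x, axis=1)
    return int(np.argmin(d))

def knn_classify(x, prototypes, labels, k=3):
    # Assign x to the class contributing the most of its k nearest prototypes;
    # k = 1 reduces to a minimum-distance rule on the individual prototypes.
    d = np.linalg.norm(prototypes - x, axis=1)
    nearest = labels[np.argsort(d)[:k]]
    classes, counts = np.unique(nearest, return_counts=True)
    return int(classes[np.argmax(counts)])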
Decision tree classification [60, 61]. Another distribution-free classifier, called a decision tree classifier, splits the N-dimensional feature space into unique regions by a sequential method. The algorithm is such that every class need not be tested to arrive at a decision. This becomes advantageous when the number of classes is very large. Moreover, unlike many other training algorithms, this algorithm is guaranteed to converge whether or not the feature space is linearly separable.
Let μk(i) and σk(i) denote the mean and standard deviation, respectively, measured from repeated independent observations of the kth prototype vector element yk(m)(i), m = 1, ..., Mk. Define the normalized average prototype features zk(i) ≜ μk(i)/σk(i) and an N x K matrix
Z = [ z1(1)  z2(1)  ...  zK(1)
      z1(2)  z2(2)  ...  zK(2)
        .      .            .
      z1(N)  z2(N)  ...  zK(N) ]          (9.140)
The row number of Z is the feature number and the column number is the object or class number. Further, let Z' denote the matrix obtained by arranging the elements of each row of Z in increasing order, with the smallest element on the left and the largest on the right. Now, the algorithm is as follows.
Decision Tree Algorithm
Step 1 Convert Z to Z'. Find the maximum distance between adjacent row elements in each row of Z'. Find r, the row number with the largest maximum distance. The row r represents a feature. Set a threshold at the midpoint of the maximum-distance boundaries and split row r into two parts.
Step 2 Convert Z' to Z such that row r is the same in both matrices. The elements of the other rows of Z' are rearranged such that each column of Z represents a prototype vector. This means, simply, that the elements of each row of Z are in the same order as the elements of row r. Split Z into two matrices Z1 and Z2 by splitting each row in a manner similar to row r.
Step 3 Repeat Steps 1 and 2 for the split matrices that have more than one column. Terminate the process when all the split matrices have only one column.
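A compact recursive sketch of Steps 1-3, assuming Z is an N x K array (rows = features, columns = class prototypes) and that Step 2's reordering is handled implicitly by selecting columns; the function names and the tuple-based tree representation are illustrative assumptions, not the text's notation.

import numpy as np

def decision_tree(Z, classes):
    # Returns a nested (feature r, threshold, left subtree, right subtree) tuple;
    # a leaf is simply a class label (a split matrix with one column).
    if Z.shape[1] == 1:
        return classes[0]
    Zs = np.sort(Z, axis=1)                     # Z': each row in increasing order
    gaps = np.diff(Zs, axis=1)                  # adjacent distances in each row
    r = int(np.argmax(gaps.max(axis=1)))        # feature (row) with the largest maximum gap
    c = int(np.argmax(gaps[r]))                 # position of that gap in row r
    thresh = 0.5 * (Zs[r, c] + Zs[r, c + 1])    # threshold at the midpoint of the gap
    left = Z[r] <= thresh                       # split the prototypes on feature r
    return (r, thresh,
            decision_tree(Z[:, left],  [cl for cl, keep in zip(classes, left)  if keep]),
            decision_tree(Z[:, ~left], [cl for cl, keep in zip(classes, ~left) if keep]))

def classify(x, node):
    # Walk the tree for a feature vector x of length N.
    while isinstance(node, tuple):
        r, t, lo, hi = node
        node = lo if x[r] <= t else hi
    return node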
The largest adjacent difference in the first row is 8; in the second row it is 7. Hence the first row is chosen, and z(1) is the feature to be thresholded. This splits Z into Z1 and Z2, as shown. Proceeding similarly with these matrices, we obtain the remaining splits. The thresholds partition the feature space and induce the decision tree, as shown in Fig. 9.58.
[Figure 9.58: the thresholds partition the (z(1), z(2)) feature space into class regions and induce the corresponding decision tree.]
(9.141)
where Ci,k is the cost of assigning x to Sk when in fact x belongs to Si, and Rk represents the region of the feature space where p(x|Sk) > p(x|Si) for every i ≠ k. The quantity C(x|Sk) represents the total cost of assigning x to Sk. It is well known that the decision rule that minimizes this risk is given by
(9.143)
In this case the probability of error in classification is also minimized, and the minimum error classifier discriminant becomes
(9.144)
In practice the p(x|Sk) are estimated from the prototype data by either parametric or nonparametric techniques, which can yield simplified expressions for the discriminant function.
There also exist some sequential classification techniques, such as the sequential probability ratio test (SPRT) and the generalized SPRT, where decisions can be made initially using fewer than N features and refined as more features are acquired sequentially [62]. The advantage lies in situations where N is large, so that it is expensive to measure or process all the features for every decision.
Figure 9.59 Segmentation by clustering. (a) Input images u1(m, n) and u2(m, n); (b) feature images v1(m, n) and v2(m, n); (c) segmentation of clouds by thresholding v1 (left) and by clustering (right); (d) segmentation of land by thresholding v2 (left) and by clustering (right).
[Figure: scatter plot of the feature vectors showing the clusters (for example, cluster 1) in the two-dimensional feature space.]
Chain method [63]. The first data sample is designated as the representative of the first cluster, and the similarity or distance of the next sample is measured from the first cluster representative. If this distance is less than a threshold, then the sample is placed in the first cluster; otherwise it becomes the representative of the second cluster. The process is continued for each new data sample until all the data have been exhausted. Note that this is a one-pass method.
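A minimal one-pass sketch of the chain method. The threshold value, the nearest-representative tie-breaking, and all names are assumptions; the text only specifies that a sample joins an existing cluster if it is within the threshold of a representative and otherwise starts a new cluster.

import numpy as np

def chain_cluster(samples, threshold):
    # One pass over the data: the first sample starts cluster 0; each new sample
    # joins the nearest existing representative if it is closer than the threshold,
    # otherwise it becomes the representative of a new cluster.
    reps, labels = [], []
    for x in samples:
        if not reps:
            reps.append(x)
            labels.append(0)
            continue
        d = [np.linalg.norm(x - r) for r in reps]
        j = int(np.argmin(d))
        if d[j] < threshold:
            labels.append(j)
        else:
            reps.append(x)
            labels.append(len(reps) - 1)
    return labels, reps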
The procedure is repeated for each xi, one at a time, until the clusters and their centers remain unchanged. If d(x, y) is the Euclidean distance, then a cluster center is simply the mean location of its elements. If K is not known, we start with a large value of K and then merge to K - 1, K - 2, ... clusters by a suitable cluster-distance measure.
[Figure 9.62 Image understanding systems: the image is mapped by feature extraction into features, which are classified into one of the classes Sk, k = 1, ..., K; the features are also mapped into a symbolic representation and an interpretation, guided by visual models and look-up tables.]
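A minimal sketch of the iterative (K-means style) clustering just described: assign each sample to the nearest center, recompute each center as the mean of its members, and stop when the assignments no longer change. The initialization, iteration cap, and names are assumptions.

import numpy as np

def kmeans_cluster(X, K, iters=100):
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(iters):
        # distance from every sample to every center, then nearest-center assignment
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        # each cluster center is the mean location of its elements
        for k in range(K):
            if np.any(new_labels == k):
                centers[k] = X[new_labels == k].mean(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centers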
Other Methods
Clusters can also be viewed as being located at the nodes of the joint Nth-order histogram of the feature vector. Other clustering methods are based on statistical nonsupervised learning techniques, ranking and intrinsic dimensionality determination, graph theory, and so on [65, 66]. Discussion of those techniques is beyond the goals of this text.
Finally, it should be noted that the success of clustering techniques is closely tied to feature selection. Clusters not detected in a given feature space may be easier to detect in rotated, scaled, or transformed coordinates. For images the feature vector elements could represent gray level, gradient magnitude, gradient phase, color, and/or other attributes. It may also be useful to decorrelate the elements of the feature vector.
Figure 9.63 A rule-based approach for printed circuit board inspection. (a) Preprocessed image; (b) image after thinning and identifying tracks and pads; (c) segmented image (obtained by region growing). Rules can be applied to the image in (c) and violations can be detected.
For example, the shape features may be mapped into the symbols representing circles, rectangles, ellipses, and the like. Interpretation is provided to the collection of symbols to develop a description of the scene. To provide interpretation, different visual models and practical rules are adopted. For example, syntactic techniques provide grammars for strings of symbols. Other relational models provide rules for describing relations and interconnections between symbols. For example, projections at different angles of a spherical object may be symbolically represented as several circles. A relational model would provide the interpretation of a sphere or a ball. Figure 9.63 shows an example of image understanding applied to inspection of printed circuit boards [73, 74].
Much work remains to be done in the formulation of problems and development of techniques for image understanding. Although it is the closing topic for this chapter, it offers a new beginning to a researcher interested in computer vision.
•
PROBLEMS
•
9.1 Calculate the means, autocorrelation, covariance, and inertia [see Eq. (9.116)] of the second-order histogram considered in Example 9.1.
9.2* Display the following features measured over 3 x 3, 5 x 5, 9 x 9, and 16 x 16 windows of a 512 x 512 image: (a) mean, (b) median, (c) dispersion, (d) standard deviation, (e) entropy, (f) skewness, and (g) kurtosis. Repeat the experiment for different images and draw conclusions about the possible use of these features in image processing applications.
9.3* From an image of your choice, extract the horizontal, vertical, 30°, 45°, and 60° edges using the DFT, and extract texture using the Haar or any other transform.
9.4* Compare the performances of the gradient operators of Table 9.2 and the 5 x 5 stochastic gradient of Table 9.5 on a noisy ideal edge model (Fig. 9.11) image with SNR = 9. Use the performance criteria of (9.25) and (9.26). Repeat the results at different noise levels and plot performance index versus SNR.
9.5* Evaluate the performance of zero-crossing operators on suitable noiseless and noisy images. Compare results with the gradient operators.
9.6 Consider a linear filter whose impulse response is the second derivative of the Gaussian kernel exp(-x²/2σ²). Show that, regardless of the value of σ, the response of this filter to an edge modeled by a step function is a signal whose zero-crossing is at the location of the edge. Generalize this result to two dimensions by considering the Laplacian of the Gaussian kernel exp[-(x² + y²)/2σ²].
9.7 The gradient magnitude and contour directions of a 4 x 6 image are shown in Fig. P9.7. Using the linkage rules of Fig. 9.16b, sketch the graph interpretation and find the edge path if the evaluation function represents the sum of edge gradient magnitudes. Apply dynamic programming to Fig. P9.7 to determine the edge curve using the criterion of Eq. (9.27) with α = 4/π, β = 1, and d(x, y) = Euclidean distance between x and y.
9.8 a. Find the Hough transforms of the figures shown in Figure P9.8.
b. (Generalized Hough transform) Suppose it is desired to detect a curve defined
[Figure P9.7: a 4 x 6 array of gradient magnitudes and contour directions. Figure P9.8: test figures for the Hough transform.]
If x0, y0, a, and b are represented by 8-bit words each, what is the dimension of C(a)?
c. If the gradient angles θi at each edge point are given, then show how the relation
∂φ/∂x + (∂φ/∂y) tan θ = 0   for (x, y, θ) = (xi, yi, θi)
LBo,.(t)dt '" 1
0;.,
and L
j-e
Bo,.(t + j) = 1, O<t < 1
•
b. If an object of uniform density is approximated by the polygon Obtained by joining
the ,adjacent control points by straight lines, find the expressions ic« Center of mass,
perimeter, area, and moments in terms of the control points. '
9.10 (Cubic B-splines) Show that the control points and the cubic B-splines sampled at uniformly spaced nodes are related via the matrices B4 as follows:
B4 = (1/6) [ 4 1 0 ... 0 1 ]          B4 = (1/6) [ 4 1 0 ... 0 ]
           [ 1 4 1 ... 0 0 ]                     [ 1 4 1 ... 0 ]
           [     ...       ]                     [     ...     ]
           [ 1 0 ... 0 1 4 ]                     [ 0 ... 0 1 4 ]
where the first matrix is for the periodic case and the second is for the nonperiodic
case.
9.11 (Properties of FDs)
a. Prove the properties of the Fourier descriptors summarized in Table 9.8.
b. Using Fig. 9.26, show that the reflection of x1, x2 is given by
x2' = [1/(A² + B²)] [-2ABx1 + (A² - B²)x2 - 2BC]
k > 0, t0 ≥ 0
[Figure P9.12]
b. (Line patterns) If the given curve is a line pattern, then a closed contour can be obtained by retracing it. Using the symmetry of the periodic curve, show that the FDs satisfy the relation
a(k) = a(-k) exp[-jk(2π/T)β]
for some β. If the trace begins at t = 0 at one of the endpoints of the pattern, then β = 0. Show how this property may be used to skeletonize a shape.
c. The area A enclosed by the outer boundary of a surface is given by
A = (1/2) ∮ (x1 dx2 - x2 dx1) = (1/2) ∫0T x1(t) (dx2/dt) dt - (1/2) ∫0T x2(t) (dx1/dt) dt
In terms of FDs show that A = -π Σk k |a(k)|². Verify this result for the surface area of a line pattern.
9.13 (Properties of AR models)
a. Prove the translation, scaling, and rotation properties of AR model parameters listed in Table 9.9.
b. Show that a closed boundary can be reconstructed from the AR model residuals εi(n) by inverting a circulant matrix.
c. Find the relation between AR model features and FDs of closed boundaries.
9.14* Scan and digitize the ASCII characters and find their medial axis transforms. Develop any alternative practical thinning algorithm to reduce printed characters to line shapes.
9.15 Compare the complexity of printed character recognition algorithms based on (a) template matching, (b) Fourier descriptors, and (c) moment matching.
9.16 (Matched filtering) Write the matched filter output SNR as
where G and U are the Fourier transforms of g(m, n) and u(m, n), respectively. Apply the Schwarz inequality to show that the SNR is maximized only when (9.132) is satisfied within a scaling constant that can be set to unity. What is the maximum value of the SNR?
9.17 If μk denotes the mean vector of class k prototypes, show that the decision rule ||x - μk||² < ||x - μi||², i ≠ k, implies x ∈ Sk, and gives a linear discriminant with ak = 2μk, bk = -||μk||².
9.18 Find the decision tree of Example 9.11 if an object class with z(1) = 15, z(2) = 30 is added to the list of prototypes.
9.19* A printed circuit board can be modeled as a network of pathways that either merge into other paths or terminate at a node. Develop a vision system for isolating defects such as breaks (open circuits) and leaks (short circuits) in the pathways. Discuss and develop practical preprocessing, segmentation, and recognition algorithms for your system.
BIBLIOGRAPHY
Sections 9.1-9.3
1. R. O. Duda and P. E. Hart. Pattern Recognition and Scene Analysis. New York: John
Wiley, 1973.
2. A. Rosenfeld and A. C. Kak, Digital Picture Processing. New York: Academic Press,
1976. Also see Vols. I and II, 1982.
3. D. H. Ballard and C. M. Brown. Computer Vision. Englewood Cliffs, N.J.: Prentice-
Hall, 1982.
4. B. S. Lipkin and A. Rosenfeld (eds.). Picture Processing and Psychopictorics. New York:
•
Academic Press, 1970.
. 5. J. K. Aggarwal, R. O. Duda and A. Rosenfeld (eds.), Computer Methods in Image
Analysis. Los Angeles: IEEE Computer Society, 1977.
Section 9.4
• • •
Section 9.6
•
For chain codes, their generalizations, and run-length coding based segmentation approaches we follow:
The theory of B-splines is well documented in the literature. For its applications in computer graphics:
•
Section 9.7 •
34. H. Samet. "Region Representation: Quadtrees from Boundary Codes." Comm. ACM
23 (March 1980): 163-170.
Section 9.8
•
For surveys and further details on texture, see Hawkins in [4], Pickett in [4], Haralick et al. in [5], and:
46. P. Brodatz. Textures: A Photographic Album for Artists and Designers. Toronto: Dover Publishing Co., 1966.
47. R. M. Haralick. "Statistical and Structural Approaches to Texture." Proc. IEEE 67 (May 1979): 786-809. Also see Image Texture Analysis. New York: Plenum, 1981.
48. G. G. Lendaris and G. L. Stanley. "Diffraction Pattern Sampling for Automatic Pattern Recognition," in [5].
49. R. P. Kruger, W. B. Thompson, and A. F. Turner. "Computer Diagnosis of Pneumoconiosis." IEEE Trans. Sys. Man, Cybern. SMC-4 (January 1974): 40-49.
50. B. Julesz, et al. "Inability of Humans to Discriminate Between Visual Textures that Agree in Second Order Statistics-Revisited." Perception 2 (1973): 391-405. Also see IRE Trans. Inform. Theory IT-8 (February 1962): 84-92.
51. O. D. Faugeras and W. K. Pratt. "Decorrelation Methods of Texture Feature Extraction." IEEE Trans. Pattern Anal. Mach. Intell. PAMI-2 (July 1980): 323-332.
52. B. H. McCormick and S. N. Jayaramamurthy. "Time Series Model for Texture Synthesis." Int. J. Comput. Inform. Sci. 3 (1974): 329-343. Also see vol. 4 (1975): 1-38.
53. G. R. Cross and A. K. Jain. "Markov Random Field Texture Models." IEEE Trans. Pattern Anal. Mach. Intell. PAMI-5, no. 1 (January 1983): 25-39.
54. T. Pavlidis. Structural Pattern Recognition. New York: Springer-Verlag, 1977.
55. N. Ahuja and A. Rosenfeld. "Mosaic Models for Textures." IEEE Trans. Pattern Anal. Mach. Intell. PAMI-3, no. 1 (January 1981): 1-11.
Section 9.12
56. G. L. Turin. "An Introduction to Matched Filtering." IRE Trans. Inform. Theory (June 1960): 311-329.
57. A. Vander Lugt, F. B. Rotz, and A. Klooster, Jr. "Character Reading by Optical Spatial Filtering," in J. Tippett et al. (eds.), Optical and Electro-Optical Information Processing. Cambridge, Mass.: MIT Press, 1965, pp. 125-141. Also see pp. 5-11 in [5].
58. J. R. Jain and A. K. Jain. "Displacement Measurement and Its Application in Interframe Image Coding." IEEE Trans. Commun. COM-29 (December 1981): 1799-1808.
59. D. L. Barnea and H. F. Silverman. "A Class of Algorithms for Fast Digital Image Registration." IEEE Trans. Computers (February 1972): 179-186.
Details of classification and clustering techniques may be found in [1] and other texts on pattern recognition. For the decision tree algorithm and other segmentation techniques:
60. C. Rosen et al. "Exploratory Research in Advanced Automation." SRI Technical Reports (First, Second, and Third Reports), NSF Grant GI-38100X1, SRI Project 2591. Menlo Park, Calif.: SRI, December 1974.
68. E. B. Henrichon, Jr. and K. S. Fu. "A Nonparametric Partitioning Procedure for Pattern Classification." IEEE Trans. Computers C-18, no. 7 (July 1969).
69. I. Kabir. "A Computer Vision System Using Fast, One Pass Algorithms." M.S. Thesis, University of California at Davis, 1983.
70. G. Hirzinger and K. Landzettel. "A Fast Technique for Segmentation and Recognition of Binary Patterns." IEEE Conference on Pattern Recognition and Image Processing, 1981.
71. D. W. Paglieroni. "Control Point Algorithms for Contour Processing and Shape Analysis." Ph.D. Thesis, University of California, Davis, 1986.
72. C. R. Brice and C. L. Fennema. "Scene Analysis Using Regions," in [5].
Section 9.15
73. A. Darwish and A. K. Jain. "A Rule Based Approach for Visual Pattern Inspection." IEEE Trans. Pattern Anal. Mach. Intell. PAMI-10, no. 1 (January 1988): 56-68.
74. J. R. Mandeville. "A Novel Method for Analysis of Printed Circuit Images." IBM J. Res. Dev. 29 (January 1985): 73-86.
Image Reconstruction
from Projections
-
10.1 INTRODUCTION
For X-ray CT scanners, a simple model of the detected image is obtained as follows. Let f(x, y) denote the absorption coefficient of the object at a point (x, y) in a slice at some fixed value of z (Fig. 10.1). Assuming the illumination to consist of an infinitely thin parallel beam of X-rays, the intensity of the detected beam is given by
I = I0 exp[-∫L f(x, y) du]          (10.1)
[Figure 10.1 X-ray CT scanning geometry: source, object f(x, y), detectors, computer, and display of the reconstructed slice; a typical projection is indicated. Figure 10.2 Projection geometry for a single ray.]
where I0 is the intensity of the incident beam, L is the path of the ray, and u is the distance along L (Fig. 10.2). Defining the observed signal as
g(s, θ) ≜ ln(I0/I) = ∫L f(x, y) du          (10.2)
where (s, θ) represent the coordinates of the X-ray relative to the object, the image reconstruction problem is to determine f(x, y) from g(s, θ). In practice we can only estimate f(x, y) because only a finite number of views of g(s, θ) are available. The preceding imaging technique is called transmission tomography because the transmission characteristics of the object are being imaged. Figure 10.1 also shows an X-ray CT scan of a dog's thorax, that is, a cross-section slice, reconstructed from 120 such projections. X-ray CT scanners are used in medical imaging and nondestructive testing of mechanical objects.
Reflection Tomography
There are other situations where the detected image is related to the object by a transformation equivalent to (10.3). For example, in radar imaging we often obtain
Emission Tomography
Another important situation where the image reconstruction problem arises is in magnetic resonance imaging (MRI).† Being noninvasive, it is becoming increasingly attractive in medical imaging for measuring (most commonly) the density of protons (that is, hydrogen nuclei) in tissue. This imaging technique is based on the fundamental property that protons (and all other nuclei that have an odd number of protons or neutrons) possess a magnetic moment and spin. When placed in a magnetic field, the proton precesses about the magnetic field in a manner analogous to a top spinning about the earth's gravitational field. Initially the protons are aligned either parallel or antiparallel to the magnetic field. When an RF signal having an appropriate strength and frequency is applied to the object, the protons absorb energy, and more of them switch to the antiparallel state. When the applied RF signal is removed, the absorbed energy is reemitted and is detected by an RF receiver. The proton density and environment can be determined from the characteristics of this detected signal. By controlling the applied RF signal and the surrounding magnetic field, these events can be made to occur along only one line within the object. The detected signal is then a function of the line integral of the MRI signal in the object. In fact, it can be shown that the detected signal is the Fourier transform of the projection at a given angle [8, 9].
Definition
The Radon transform of a function f(x, y), denoted as g(s, θ), is defined as its line integral along a line inclined at an angle θ from the y-axis and at a distance s from the origin:
† Also called nuclear magnetic resonance (NMR) imaging. To emphasize its noninvasive features, the word nuclear is being dropped by manufacturers of such imaging systems to avoid confusion with nuclear reactions associated with nuclear energy and radioactivity.
g(s, θ) = Rf = ∫∫ f(x, y) δ(x cos θ + y sin θ - s) dx dy,   -∞ < s < ∞,  0 ≤ θ < π          (10.4)
The symbol R, denoting the Radon transform operator, is also called the projection operator. The function g(s, θ), the Radon transform of f(x, y), is the one-dimensional projection of f(x, y) at an angle θ. In the rotated coordinate system (s, u), where
s = x cos θ + y sin θ,   u = -x sin θ + y cos θ
or                                                              (10.5)
x = s cos θ - u sin θ,   y = s sin θ + u cos θ
(10.4) can be expressed as
The quantity g(s, θ) is also called a ray-sum, since it represents the summation of f(x, y) along a ray at a distance s and at an angle θ.
The Radon transform maps the spatial domain (x, y) to the domain (s, θ). Each point in the (s, θ) space corresponds to a line in the spatial domain (x, y). Note that (s, θ) are not the polar coordinates of (x, y). In fact, if (r, φ) are the polar coordinates of (x, y), that is,
x = r cos φ,   y = r sin φ          (10.7)
then from Fig. 10.3a
s = r cos(θ - φ)          (10.8)
For a fixed point (r, φ), this equation gives the locus of all the points in (s, θ), which is a sinusoid as shown in Fig. 10.3b. Recall from Section 9.5 that the coordinate pair (s, θ) is also the Hough transform of the straight line in Fig. 10.3a.
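For a sampled image, the ray-sums can be approximated by rotating the image and summing along one axis. The following sketch is not from the text; the use of scipy.ndimage.rotate, the angle sign convention, and the array layout (one projection per row) are assumptions.

import numpy as np
from scipy.ndimage import rotate

def radon(image, angles_deg):
    # For each angle, rotate the image so the ray direction becomes a column
    # direction and sum along that direction; each row of the result is one
    # discrete projection g(s, theta).
    g = []
    for theta in angles_deg:
        rot = rotate(image, -theta, reshape=False, order=1)
        g.append(rot.sum(axis=0))          # ray-sums along the rotated columns
    return np.array(g)                     # shape (number of angles, number of s samples)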
Example 10.1
Consider a plane wave f(x, y) = exp[j2π(4x + 3y)]. Then its projection function is
g(s, θ) = (1/|cos θ|) exp(j8πs/cos θ) δ(4 tan θ - 3) = (1/5) exp(j10πs) δ(θ - tan⁻¹(3/4))
where f'(θ) ≜ df(θ)/dθ and θk, k = 1, 2, ..., are the roots of f(θ).
[Figure 10.3 Spatial and Radon transform domains. (a) A line at distance s and angle θ in the (x, y) plane; (b) the sinusoidal locus s = r cos(θ - φ) traced in the (s, θ) plane by a fixed point (r, φ); (c) an image and its Radon transform.]
Notation
In order to avoid confusion between functions defined in different coordinates, we adopt the following notation. Let U be the space of functions defined on R², where R denotes the real line. The two-dimensional Fourier transform pair for a function f(x, y) ∈ U is denoted by the relation
f(x, y) ↔ F(ξ1, ξ2)          (10.10)
In polar coordinates we write
Let V be the space of functions defined on R x [0, π]. The one-dimensional Fourier transform of a function g(s, θ) ∈ V is defined with respect to the variable s and is indicated as
g(s, θ) ↔ G(ξ, θ)          (10.13)
The inner product in V is defined as
(10.14)
For simplicity we will generally consider U and V to be spaces of real functions. The notation
g = Rf          (10.15)
will be used to denote the Radon transform of f(x, y), where it will be understood that f ∈ U, g ∈ V.
The Radon transform is linear and has several useful properties (Table 10.1), which can be summarized as follows. The projections g(s, θ) are space-limited in s if the object f(x, y) is space-limited in (x, y), and are periodic in θ with period 2π. A translation of f(x, y) results in the shift of g(s, θ) by a distance equal to the pro-
[Figure 10.4 (a) Head phantom model; (b) constant-density ellipse, f(x, y) = f1 for (x²/a²) + (y²/b²) ≤ 1.]
jection of the translation vector on the line s = x cos θ + y sin θ. A rotation of the object by an angle θ0 causes a translation of its Radon transform in the variable θ. A scaling of the (x, y) coordinates of f(x, y) results in scaling of the s coordinate together with an amplitude scaling of g(s, θ). Finally, the total mass of a distribution f(x, y) is preserved by g(s, θ) for all θ.
Example 10.2 Computer generation of projections of a phantom
In the development and evaluation of reconstruction algorithms, it is useful to simulate projection data corresponding to an idealized object. Figure 10.4a shows an object composed of ellipses, which is intended to model the human head [18, 21]. Table 10.2 gives the parameters of the component ellipses. For the ellipse shown in Fig. 10.4b, the projection at an angle θ is given by
Using the properties of the Radon transform, the projection function for the object of Fig. 10.4a can be calculated (see Fig. 10.13a).
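The analytic projection of the centered constant-density ellipse of Fig. 10.4b is simply the chord length through the ellipse times its density; translated and rotated component ellipses then follow from the properties in Table 10.1. The sketch below uses that standard chord-length expression; the function name, array handling, and default density are assumptions.

import numpy as np

def ellipse_projection(s, theta, a, b, f1=1.0):
    # Projection g(s, theta) of the ellipse x^2/a^2 + y^2/b^2 <= 1 with density f1:
    # nonzero only for |s| <= sqrt(a^2 cos^2(theta) + b^2 sin^2(theta)).
    a2 = (a * np.cos(theta)) ** 2 + (b * np.sin(theta)) ** 2   # squared half-width of the projection
    s = np.asarray(s, dtype=float)
    g = np.zeros_like(s)
    inside = s ** 2 <= a2
    g[inside] = 2.0 * f1 * a * b * np.sqrt(a2 - s[inside] ** 2) / a2
    return g

# A full head-phantom projection would be the sum of the component ellipse
# projections, shifted and rotated per Table 10.2 and the Table 10.1 properties.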
Definition
Associated with the Radon transform is the back-projection operator B, which is defined as
Back-projection represents the accumulation of the ray-sums of all of the rays that pass through the point (x, y) or (r, φ).
Remarks
The back-projection operator B maps a function of (s, θ) coordinates into a function of spatial coordinates (x, y) or (r, φ).
The back-projection b(x, y) at any pixel (x, y) requires projections from all directions. This is evident from (10.16).
Table 10.3 (in part). Shepp-Logan filter: H(ξ) = |ξ| sinc(ξd) rect(ξd), with impulse response h(s) = 2/[π²(d² - 4s²)] and samples h(md) = 2/[π²d²(1 - 4m²)]. Low-pass cosine filter: H(ξ) = |ξ| cos(πξd) rect(ξd), with impulse response (1/2)[hRL(s - d/2) + hRL(s + d/2)] and samples (1/2)[hRL(md - d/2) + hRL(md + d/2)], where hRL is the Ram-Lak impulse response.
where .':7 2 denotes the two-dimensional Fourier transform operator. In practice the
filter I~l is replaced by a pnysically realizable approximation (see Table 10.3). This
method [16] is appealing because the filtering operations can be implemented
approximately via the FFT. However, it has two major difficulties. First, the Fourier
domain computation of I~JF;,(t, 6) gives F(O, 0) := 0, which yields the total density
fJf(x, y) dxdy := O. Second.since the support of (fig is unbounded,f(x, y) has to be ,
computed over a region much larger than the region of support of f(x, y). A better
algorithm, which follows from the projection theorem discussed next, reverses the
• order of filtering and back-projection operations and is more attractive for practical
implementations.
•
•
10.4 THE PROJECTION THEOREM [5-7, 12, 13]
•
Figure 10.7 shows the meaning of this result. This theorem is also called the projection-slice theorem.
Proof. Using (10.6) in the definition of G(ξ, θ), we can write
Performing the coordinate transformation from (s, u) to (x, y) [see (10.5)], this becomes
Remarks
From the symmetry property of Table 10.1, we find that the Fourier transform slice also satisfies a similar property
G(-ξ, θ + π) = G(ξ, θ)          (10.24)
If f(x, y) is bandlimited, then so are the projections. This follows immediately from the projection theorem.
An important consequence of the projection theorem is the following result.
Inverse Radon Transform Theorem. Given g(s, θ) = Rf, -∞ < s < ∞, 0 ≤ θ < π, its inverse Radon transform is
f(x, y) = (1/2π²) ∫0π ∫-∞∞ [(∂g/∂s)(s, θ)] / [x cos θ + y sin θ - s] ds dθ          (10.26)
In polar coordinates
fp(r, φ) = f(r cos φ, r sin φ) = (1/2π²) ∫0π ∫-∞∞ [(∂g/∂s)(s, θ)] / [r cos(θ - φ) - s] ds dθ          (10.27)
where (1/j2π)[∂g(s, θ)/∂s] and (-1/j2πs) are the Fourier inverses of ξG(ξ, θ) and sgn(ξ), respectively. Combining (10.29) and (10.31), we obtain the desired result of (10.26). Equation (10.27) is arrived at by the change of coordinates x = r cos φ and y = r sin φ.
Remarks
The inverse Radon transform is obtained in two steps (Fig. 10.8a). First, each projection g(s, θ) is filtered by a one-dimensional filter whose frequency response is |ξ|. The result, ĝ(s, θ), is then back-projected to yield f(x, y). The filtering operation can be performed either in the s domain or in the ξ domain. This process yields two different methods of finding R⁻¹, which are discussed shortly.
The integrands in (10.26), (10.27), and (10.31) have singularities. Therefore, the Cauchy principal value should be taken (via contour integration) in evaluating the integrals.
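A compact sketch of the two steps just described (filter each projection by a bandlimited approximation of |ξ|, then back-project with linear interpolation). The sinogram layout, the FFT-based ramp approximation, the grid centering, and the names are assumptions, not the text's implementation.

import numpy as np

def filtered_back_projection(sinogram, thetas):
    # sinogram: (number of angles, M) array of projections g(s, theta);
    # thetas: projection angles in radians.  Returns an M x M reconstruction.
    num_angles, M = sinogram.shape
    # Step 1: filter each projection with a discrete ramp |xi| via the FFT.
    ramp = np.abs(np.fft.fftfreq(M))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))
    # Step 2: back-project the filtered projections.
    c = (M - 1) / 2.0
    x = np.arange(M) - c
    X, Y = np.meshgrid(x, x)
    recon = np.zeros((M, M))
    for g, th in zip(filtered, thetas):
        s = np.clip(X * np.cos(th) + Y * np.sin(th) + c, 0, M - 1.000001)
        s0 = np.floor(s).astype(int)
        w = s - s0
        recon += (1 - w) * g[s0] + w * g[s0 + 1]      # linear interpolation in s
    return recon * np.pi / num_angles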
Definition. The Hilbert transform of a function φ(t) is defined as
ψ(s) ≜ Hφ ≜ φ(s) ⊛ (1/πs) = (1/π) ∫-∞∞ φ(t)/(s - t) dt          (10.32)
The symbol H represents the Hilbert transform operator. From this definition it follows that ĝ(s, θ) is the Hilbert transform of (1/2π) ∂g(s, θ)/∂s for each θ.
Because the back-projection operation is required for finding R⁻¹, the reconstructed image pixel at (x, y) requires projections from all directions.
[Figure 10.8 Inverse Radon transform methods. (a) Differentiate each projection g(s, θ), take its Hilbert transform, scale by 1/2π, and back-project to obtain f(x, y). (b) Fourier transform each projection, multiply G(ξ, θ) by the filter |ξ|, inverse Fourier transform to get ĝ(s, θ), and back-project. (c) Filter back-projection method.]
(10.33)
f(x, y) = (j/π) ∫0π ∫-∞∞ exp(j10πs) [x cos θ + y sin θ - s]⁻¹ δ(θ - φ) dθ ds
        = (1/jπ) ∫-∞∞ exp(j10πs) [s - (x cos φ + y sin φ)]⁻¹ ds
Since the Fourier inverse of 1/(ξ - a) is jπ exp(j2πas) sgn(s), the preceding integral becomes
f(x, y) = exp[j2π(x cos φ + y sin φ)ξ] sgn(ξ)|ξ=5 = exp[j10π(x cos φ + y sin φ)]
[Figure 10.9 Radon domain filter implementation: two-dimensional filtering of f(x, y) by A(ξ1, ξ2) is equivalent to filtering each projection g(s, θ) with a one-dimensional filter Ap(ξ, θ) and then inverting the Radon transform.]
a one-dimensional filter whose frequency response is Ap(ξ, θ) and then taking the inverse Radon transform of the result. Using the representation of R⁻¹ in Fig. 10.8a, we obtain a generalized filter-back-projection algorithm, where the filter now becomes |ξ|Ap(ξ, θ). Hence, the two-dimensional filter A(ξ1, ξ2) can be implemented as
The foregoing results are useful for developing practical image reconstruction algorithms. We now discuss various considerations for digital implementation of these algorithms.
Sampling Considerations
• (10.40)
Choice of Filters
The filter function |ξ| required for the inverse Radon transform emphasizes the high spatial frequencies. Since most practical images have a low SNR at high frequencies, the use of this filter results in noise amplification. To limit the unbounded nature of the frequency response, a bandlimited filter, called the Ram-Lak filter [19],
(10.41)
has been proposed. In practice, most objects are space-limited and a bandlimiting filter with a sharp cutoff frequency ξ0 is not very suitable, especially in the presence of noise. A small value of ξ0 gives poor resolution and a very large value leads to noise amplification. A generalization of (10.41) is the class of filters
H(ξ) = |ξ| W(ξ)          (10.42)
Here W(ξ) is a bandlimiting window function that is chosen to give a more moderate high-frequency response in order to achieve a better trade-off between the filter bandwidth (that is, high-frequency response) and noise suppression. Table 10.3 lists several commonly used filters. Figure 10.10 shows the frequency and
[Figure 10.10 Reconstruction filters. Left column: frequency response; right column: impulse response; dotted lines show linearly interpolated response. (a) Ram-Lak; (b) Shepp-Logan; (c) low-pass cosine; (d) generalized Hamming.]
the impulse responses of these filters for d = 1. Since these functions are real and even, the impulse responses are displayed on the positive real line only. For low levels of observation noise, the Shepp-Logan filter is preferred over the Ram-Lak filter. The generalized low-pass Hamming window, with the value of α optimized for the noise level, is used when the noise is significant. In the presence of noise a better approach is to use the optimum mean square reconstruction filter, also called the stochastic filter (see Section 10.8).
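The windowed-ramp family H(ξ) = |ξ|W(ξ) of (10.42) can be sketched directly; the cutoff convention (band limit at 1/2d), the default Hamming parameter, and the function interface are assumptions rather than the text's exact definitions.

import numpy as np

def reconstruction_filter(xi, kind="ram-lak", d=1.0, alpha=0.54):
    # Frequency response H(xi) = |xi| W(xi) for a few common windows W.
    xi = np.asarray(xi, dtype=float)
    band = (np.abs(xi) <= 0.5 / d).astype(float)          # rect window, cutoff at 1/2d
    if kind == "ram-lak":
        W = band
    elif kind == "shepp-logan":
        W = np.sinc(xi * d) * band                        # sinc = sin(pi x)/(pi x)
    elif kind == "cosine":
        W = np.cos(np.pi * xi * d) * band
    elif kind == "hamming":
        W = (alpha + (1 - alpha) * np.cos(2 * np.pi * xi * d)) * band
    else:
        raise ValueError(kind)
    return np.abs(xi) * W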
Once the filter has been selected, a practical reconstruction algorithm has two major steps:
where BN is called the discrete back-projection operator. Because of the back-projection operation, it is necessary to interpolate the filtered projections ĝn(m). This is required even if the reconstructed image is evaluated on a sampled grid. For example, to evaluate
f(iΔx, jΔy) = Δ Σn=0N-1 ĝ(iΔx cos nΔ + jΔy sin nΔ, nΔ)          (10.47)
on a grid with spacing (Δx, Δy), i, j = 0, ±1, ±2, ..., we still need to evaluate ĝ(s, nΔ) at locations in between the points md, m = -M/2, ..., M/2 - 1. Although higher-order interpolation via the Lagrange functions (see Chapter 4) is possible, the linear interpolation of (10.45) has been found to give a good trade-off between resolution and smoothing [18]. A zero-order hold is sometimes used to speed up the back-projection operation for hardware implementation.
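Evaluating (10.47) at a single pixel makes the interpolation step explicit. In the sketch below the sample spacing d, the centering of the s index, and the names are assumptions; the linear interpolation between adjacent samples follows (10.45) as described above.

import numpy as np

def back_project_pixel(g_hat, delta_theta, d, x, y):
    # g_hat[n, m]: filtered projection samples at s = m*d (m centered) and angle n*delta_theta.
    N, M = g_hat.shape
    value = 0.0
    for n in range(N):
        theta = n * delta_theta
        s = (x * np.cos(theta) + y * np.sin(theta)) / d + M / 2   # fractional sample index
        m0 = int(np.floor(s))
        if 0 <= m0 < M - 1:
            w = s - m0
            value += (1 - w) * g_hat[n, m0] + w * g_hat[n, m0 + 1]  # linear interpolation
    return delta_theta * value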
In Fig. 10.11b, the filtering operation is performed in the frequency domain according to the equation
ĝ(s, θ) = F⁻¹[G(ξ, θ) H(ξ)]          (10.48)
Given H(ξ), the filter frequency response, this filter is implemented approximately by using a sampled approximation of G(ξ, θ) and substituting a suitable FFT for the inverse Fourier transform. The algorithm is shown in Fig. 10.11b, which is a one-dimensional equivalent of the algorithm discussed in Section 8.3 (Fig. 8.13b). The steps of this algorithm are given next:
1. Extend the sequence gn(m), -M/2 ≤ m ≤ (M/2) - 1, by padding zeros and periodic repetition to obtain the sequence g̃n(m), 0 ≤ m ≤ K - 1. Take its FFT to obtain G̃n(k), 0 ≤ k ≤ K - 1. The choice of K determines the sampling resolution in the frequency domain. Typically K = 2M if M is large; for example, K = 512 if M = 256.
2. Sample H(ξ) to obtain Ĥ(k) ≜ H(kΔξ), Ĥ(K - k) ≜ H*(k), 0 ≤ k < K/2, where * denotes the complex conjugate.
3. Multiply the sequences G̃n(k) and Ĥ(k), 0 ≤ k ≤ K - 1, and take the inverse FFT of the product. A periodic extension of the result gives ĝn(m), -K/2 ≤ m ≤ (K/2) - 1. The reconstructed image is obtained via (10.45) and (10.46) as before.
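A sketch of steps 1-3 for one projection. It assumes H is supplied as a function of normalized frequency (for example, the windowed-ramp filters sketched earlier), and it keeps only the first M output samples in place of the periodic re-indexing; those conventions are assumptions.

import numpy as np

def filter_projection_fft(g_n, H, K):
    # g_n: one projection of length M.  Step 1: zero-pad to length K and take the FFT.
    M = len(g_n)
    g_pad = np.zeros(K)
    g_pad[:M] = g_n
    G = np.fft.fft(g_pad)
    # Step 2: sample H on the conjugate-symmetric frequency grid (real, even filter).
    k = np.arange(K)
    H_k = H(np.minimum(k, K - k) / K)
    # Step 3: multiply and inverse transform; keep the samples corresponding to the projection.
    g_filt = np.real(np.fft.ifft(G * H_k))
    return g_filt[:M]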
Example 10.5
Figure 10.12b shows a typical projection of an object digitized on a 128 x 128 grid (Fig. 10.12a). Reconstructions obtained from 90 such projections, each with 256 samples per line, using the convolution back-projection algorithm with Ram-Lak and Shepp-Logan filters, are shown in Fig. 10.12c and d, respectively. Intensity plots of the object and its reconstructions along a horizontal line through its center are shown in Fig. 10.12f through h. The two reconstructions are almost identical in this (noiseless) case. The background noise that appears is due to the high-frequency response of the reconstruction filter and is typical of inverse (or pseudoinverse) filtering. The stochastic filter outputs shown in Fig. 10.12e and i show an improvement over this result. This filter is discussed in Section 10.8.
A Unitary Transform
Radon transform theory for random fields can be understood more easily by considering the operator
" 30
./
20
, g(s, 01
t" 10
~
lL, !
t'..--
%
y; o
,
i -10
-128 -96 -64 -32 o 32 64 96 128
~ .. .'
~:
,
-,' -
Ii--
t-
tt;
,\
j
•,
J •
~;;;
z'-
If-
• •
Figure 10.12 Image rcconstrucnon
(el Stochastic filter example,
2 2
"" ,
1 f- , 1 •
o ... '-- o
•
-1 -1
, •
,
•
•
3
~1 -1 ,
G ≜ H^(1/2) R          (10.49)
where H^(1/2) represents a one-dimensional filter whose frequency response is |ξ|^(1/2). The operation
ğ(s, θ) ≜ Gf = H^(1/2) Rf = H^(1/2) g          (10.50)
is equivalent to filtering the projections by H^(1/2) (Fig. 10.14). This operation can also be realized by a two-dimensional filter with frequency response (ξ1² + ξ2²)^(1/4) followed by the Radon transform.
•
" - .--
-_.-,.,--~
abcdefg
,
hijklmn
0pCIrstu
•
VlVXYZ
.,r
, Ie} original binary image . (ell reconstruction using fully constrained ART
algorithm
Also, let rp(s, θ) be the one-dimensional inverse Fourier transform of Sp(ξ, θ), that is,
(10.55)
-
Theorem 10.2. The operator ill is a whitening transform in e for stationary
random fields, and the autocorrelation function of frs, 6) is given by
rii(s, 6;;', 6'),1. E[g(s, 6)g(s', e~)l =rg(s -s', 6)0(e-6') (1O.56a)
where
ri (s, 6) = r, (s, 8) (lO.56b)
This means the random field g (s, e) defined via (10.50) is stationary in sand
uncorrelated in O. Since g (s, e) can be obtained by passing g(s, 6) through !7('1!2,
which is independent of 6, g(s, 6) itself must be also un correlated in e. Thus, the
Radon transform is also a whitening transform in efor stationary random fields and
the autocorrelation function of g(s, &) must be of the form
. '" ;.
where rg(s, 6) is yet to be specified. Now, for 'any given e, we define the power
spectrum density of g (s, e) as the one-dimensional Fourier transform of its auto-
correlation function with respect to s, that is, .
,
(10.58)
This theorem is noteworthy because it states that the central slice of the two-
dimensional power spectrum density S (~ 1,~2) is equal to the one-dimensional
power spectrum of g (s, 0) and not of g(s, e). On the other hand, the projection
theorem states that the central slice of a two-dimensional amplitude spectrum den-
sity (that is, the Fourier transform) F(g I , ~ 2) is equal to the one-dimensional ampli-
tude spectrum density (that is, the Fourier transform) of g(s, 0) and not of g (s, 0).
Combining (10.59) and (10.60), we get
(10.62)
and
(10.66)
Measurement Model
In the presence of noise, the reconstruction filters listed in Table 10.3 are not optimal in any sense. Suppose the projections are observed as
w(s, θ) = ∫-∞∞ hp(s - s', θ) g(s', θ) ds' + ν(s, θ),   -∞ < s < ∞, 0 ≤ θ ≤ π          (10.68)
The function hp(s, θ) represents a shift-invariant blur (with respect to s), which may occur due to the projection-gathering instrumentation, and ν(s, θ) is additive, zero mean noise independent of f(x, y) and uncorrelated in θ [see (10.64a)]. The optimum linear mean square reconstruction filter can be determined by applying the Wiener filtering ideas that were discussed in Chapter 8.
The optimum linear mean square estimate of f(x, y), denoted by f̂(x, y), can be reconstructed from w(s, θ) by the filter/convolution back-projection algorithm (Problem 10.14)
ĝ(s, θ) = ∫-∞∞ ap(s - s', θ) w(s', θ) ds'          (10.70)
where ap(s, θ) and hp(s, θ) have the one-dimensional Fourier transforms Ap(ξ, θ) and Hp(ξ, θ), respectively.
Remarks
Note that Ap(ξ, θ) is the one-dimensional Wiener filter for ĝ(s, θ) given w(s, θ). This means the overall optimum filter Ap is the cascade of |ξ|, the filter required for the inverse Radon transform, and a window function W(ξ, θ) representing the locally optimum filter for each projection. In practice, Ap(ξ, θ) can be estimated adaptively for each θ by estimating Sw(ξ, θ), the power spectrum density of the observed projection w(s, θ).
Example 10.6 Reconstruction from noisy projections
Suppose the covariance function of the object is modeled by the isotropic function r(x, y) = σ² exp(-α√(x² + y²)). The corresponding power spectrum is then S(ξ1, ξ2) = 2πασ²[α² + 4π²(ξ1² + ξ2²)]^(-3/2), or Sp(ξ, θ) = 2πασ²[α² + 4π²ξ²]^(-3/2). Assume there is no blur and let rν(s, θ) = σν². Then the frequency response of the optimum reconstruction filter, henceforth called the stochastic filter, is given by
Ap(ξ, θ) = |ξ| Sp(ξ, θ) / [Sp(ξ, θ) + σν²|ξ|] = 2πασ²|ξ| / [2πασ² + σν²|ξ|(α² + 4π²ξ²)^(3/2)]
         = 2πα(SNR)|ξ| / [2πα(SNR) + |ξ|(α² + 4π²ξ²)^(3/2)],    SNR ≜ σ²/σν²
This filter is independent of θ and has a frequency response much like that of a band-pass filter (Fig. 10.15a). Figure 10.15b shows the impulse response of the stochastic filter used for reconstruction from noisy projections with σν² = 5, σ = 0.0102, and α = 0.266. Results of reconstruction are shown in Fig. 10.15c through i. Comparisons with the Shepp-Logan filter indicate that significant improvement results from the use of the stochastic filter. In terms of mean square error, the stochastic filter performs 13.5 dB better than the Shepp-Logan filter in the case of σν² = 5. Even in the noiseless case (Fig. 10.12) the stochastic filter designed with a high value of SNR (such as 100) provides a better reconstruction. This is because the stochastic filter tends to moderate the high-frequency components of the noise that arise from errors in computation.
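The frequency response of such a stochastic filter, the ramp |ξ| cascaded with a Wiener-type window built from the exponential-covariance object spectrum, can be sketched as follows. The exact normalization in the example above is uncertain from the source, so the formula here should be read as one plausible form, not the text's definitive one; the parameter values in the usage line are those quoted in the example.

import numpy as np

def stochastic_filter(xi, alpha, sigma2, sigma_v2):
    # Ramp |xi| cascaded with the window S_p/(S_p + noise term) for an object with
    # isotropic covariance sigma2 * exp(-alpha * r); one plausible normalization.
    xi = np.asarray(xi, dtype=float)
    S_p = 2 * np.pi * alpha * sigma2 / (alpha**2 + 4 * np.pi**2 * xi**2) ** 1.5
    window = S_p / (S_p + sigma_v2 * np.abs(xi))
    return np.abs(xi) * window

# Band-pass-like response: zero at xi = 0 and rolling off again at high frequencies.
A = stochastic_filter(np.linspace(-2, 2, 401), alpha=0.266, sigma2=1.0, sigma_v2=5.0)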
[Figure 10.15 (a) A typical frequency response of a stochastic filter, Ap(ξ, θ) = Ap(-ξ, θ); (b) its impulse response; (c)-(i) reconstructions and intensity plots from noisy projections; (h) Shepp-Logan filter, σν² = 5; (i) stochastic filter, σν² = 5.]
Algorithm
There are three stages of this algorithm (Fig. 10.16b). First we obtain Gn(k) = G(kΔξ, nΔθ), -K/2 ≤ k ≤ K/2 - 1, 0 ≤ n ≤ N - 1, as in Fig. 10.11b. Next, the Fourier domain samples available on a polar raster are interpolated to yield estimates on a rectangular raster (see Section 8.16). In the final stage of the algorithm, the two-dimensional inverse Fourier transform is approximated by a suitable-size inverse FFT. Usually, the size of the inverse FFT is taken to be two to three times that of each dimension of the image. Further, an appropriate window is used before inverse transforming in order to minimize the effects of Fourier domain truncation and sampling.
Although there are many examples of successful implementation of this algorithm [29], it has not been as popular as the convolution back-projection algorithm. The primary reason is that the interpolation from the polar to the rectangular grid in the frequency plane is prone to aliasing effects that could yield an inferior reconstructed image.
[Figure 10.16 Direct Fourier reconstruction: the projections g(s, θ) are Fourier transformed to fill the Fourier space G(ξ, θ) on a polar raster, which is interpolated to a rectangular raster and inverse transformed to obtain f(x, y).]
In magnetic resonance imaging there are two distinct scanning modalities, the projection geometry and the Fourier geometry [30]. In the projection geometry mode, the observed signal is G(ξ, θ), sampled at ξ = kΔξ, -K/2 ≤ k ≤ K/2 - 1, θ = nΔθ, 0 ≤ n ≤ N - 1, Δθ = π/N. Reconstruction from such data necessitates the availability of an FFT processor, regardless of which algorithm is used. For example, the filter back-projection algorithm would require the inverse Fourier transform of
(10.78)
to estimate g(sm, θn) on the uniform grid by interpolating b(σm, βn). Once g(s, θ) is available on a uniform grid, we can use the foregoing parallel beam reconstruction algorithms. Another alternative is to derive the divergent beam reconstruction algorithms directly in terms of b(σ, β) by using (10.77) and (10.78) in the inverse Radon transform formulas. (See Problem 10.16.)
In practice, rebinning seems to be preferred because it is simpler and can be fed into the already developed convolution/filter back-projection algorithms (or processors). However, there are situations where the data volume is so large that the storage requirements for rebinning assume unmanageable proportions. In such cases the direct divergent beam reconstruction algorithms would be preferable because only one projection, b(σ, β), would be used at a time, in a manner characteristic of the convolution/filter back-projection algorithms.
All the foregoing reconstruction algorithms are based on Radon transform theory and the projection theorem. It is possible to formulate the reconstruction problem as a general image restoration problem solvable by techniques discussed in Chapter 8.
which can be solved for ai,j as a set of linear simultaneous equations via least squares, generalized inverse, or other methods. Once the ai,j are known, f(x, y) is obtained directly from (10.79).
A particular case of interest is when f(x, y) is digitized on, for instance, an I x J grid and f is assumed to be constant in each pixel region. Then ai,j equals fi,j, the sampled value of f(x, y) in the (i, j)th pixel, and
φi,j(x, y) = 1 inside the (i, j)th pixel region, 0 otherwise          (10.82)
Now (10.81) becomes
g(sm, θn) = Σi=1I Σj=1J fi,j hi,j(sm, θn),   0 ≤ m ≤ M - 1,  0 ≤ n ≤ N - 1          (10.83)
where φpT is the pth row of Φ and gp is the pth element of g. The algorithm, originally due to Kaczmarz [30], has iterations that progress cyclically as
Each iteration is such that only one of the P equations (or constraints) is satisfied at a time. For example, from (10.87) we can see
(10.89)
that is, the [(k + 1) modulo P]th constraint is satisfied at the (k + 1)th iteration. This algorithm is easy to implement since it operates on one projection sample at a time. The operation ⟨φk+1, fk⟩ is equivalent to taking the [(k + 1) modulo P]th projection sample of the previous estimate. The sparse structure of φk can easily be exploited to reduce the number of operations at each iteration. The speed of convergence is usually slow, and difficulties arise in deciding when to stop the iterations. Figure 10.13d shows an image reconstructed using the fully constrained ART algorithm in (10.87). The result shown was obtained after five complete passes through the image starting from an all black image.
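A minimal sketch of the Kaczmarz iteration: cycle through the ray equations and project the current estimate onto the hyperplane of one equation at a time, so that constraint is satisfied exactly after the update. The dense matrix interface, the clipping option used to mimic a fully constrained variant, and the names are assumptions.

import numpy as np

def art_reconstruct(Phi, g, passes=5, clip=None):
    # Phi: P x Q system matrix (one row per ray-sum), g: length-P projection data.
    P, Q = Phi.shape
    f = np.zeros(Q)                          # start from an all-black image
    for _ in range(passes):
        for p in range(P):
            phi = Phi[p]
            norm2 = phi @ phi
            if norm2 > 0:
                f += (g[p] - phi @ f) / norm2 * phi    # satisfy the pth constraint exactly
            if clip is not None:
                np.clip(f, clip[0], clip[1], out=f)    # optional amplitude constraints
    return f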
where x ≜ [x, y, z]T and û ≜ [sin θ cos φ, sin θ sin φ, cos θ]T. This is also called the three-dimensional Radon transform of f(x, y, z). The Radon transform theory can be readily generalized to three or higher dimensions. The following theorems provide algorithms for reconstruction of f(x, y, z) from the projections g(s, û).
is the central slice F(ξû) in the direction û of the three-dimensional Fourier transform of f(x), that is,
G(ξ, û) = F(ξû) ≜ F(ξ sin θ cos φ, ξ sin θ sin φ, ξ cos θ)          (10.92)
[Figure 10.19 Projection geometry for three-dimensional objects.]
where
(10.93)
where
ğ(s, û) = F⁻¹{(ξ²/2) G(ξ, û)} = -(1/8π²) ∂²g(s, û)/∂s²          (10.95)
[Figure 10.21 Two-stage reconstruction.]
N = Σj Kj          (10.99)
10.13 SUMMARY
In this chapter we have studied the fundamentals and algorithms for reconstruction of objects from their projections. The projection theorem is a fundamental result of Fourier theory, which leads to useful algorithms for inverting the Radon transform. Among the various algorithms discussed here, the convolution back-projection is the most widely used. Among the various filters, the stochastic reconstruction filter seems to give the best results, especially for noisy projections. For low levels of noise, the modified Shepp-Logan filter performs equally well.
The Radon transform theory itself is very useful for filtering and representation of multidimensional signals. Most of the results discussed here can also be extended to multidimensions.
PROBLEMS
10.1 Prove the properties of the Radon transform listed in Table 10.1.
10.2 Derive the expression for the Radon transform of the ellipse shown in Figure 10.4b.
10.3 Express the Radon transform of an object fp(r, φ) given in polar coordinates.
10.4 Find the Radon transform of
a. exp[-π(x² + y²)], for all x, y
b. exp[(j2π/L)(kx + ly)], -L/2 < x, y ≤ L/2
Figure P10.5
10.6 In order to prove (10.19), use the definitions of B and R to show that
Now using the identity (10.9), prove (10.19). Transform to polar coordinates and prove (10.20).
10.7 Show that the back-projection operator B is the adjoint of R. This means for any a(x, y) ∈ U and b(s, θ) ∈ V, show that ⟨Ra, b⟩V = ⟨a, Bb⟩U.
10.8 Find the Radon transforms of the functions defined in Problem 10.4 by applying the projection theorem.
10.9 Apply the projection theorem to the function f(x, y) ≜ f1(x, y) ⊛ f2(x, y) and show that F1[Rf] = G1(ξ, θ) G2(ξ, θ). From this, prove the convolution-projection theorem.
10.10 Using the formulas (10.26) or (10.27), verify the inverse Radon transforms of the results obtained in Problem 10.8.
10.11 If an object is space limited by a circle of diameter D and if ξ0 is the largest spatial frequency of interest in the polar coordinates of the Fourier domain, show that the number of projections required to avoid aliasing effects due to angular sampling in the transform domain must be N > πDξ0.
10.12* Compute the frequency responses of the linearly interpolated digital filter responses shown in Figure 10.10. Plot and compare these with H(ξ).
10.13 a. (Proof of Theorem 10.1) Using the fact that |ξ| is real and symmetric, first show that H^(1/2)[H^(1/2) g] = (1/2π) H(∂g/∂s), where H and ĝ are defined in (10.32) and (10.33). Then show that the composition of back-projection with these filters satisfies B H^(1/2) H^(1/2) R = R⁻¹ R = I (identity). Now observe that ⟨ğ, ğ⟩V = ⟨Gf, ğ⟩V = ⟨Rf, ĝ⟩V and use Problem 10.7 to show ⟨Rf, ĝ⟩V = ⟨f, Bĝ⟩U = ⟨f, f⟩U.
b. (Proof of Theorem 10.3) From Fig. 10.14 observe that ğ = Gf = H^(1/2) Rf. Using this show that:
(i) E[f(x, y) ğ(s, θ)] = r̆p(s - x cos θ - y sin θ, θ),   r̆p(s, θ) ≜ F⁻¹{|ξ| Sp(ξ, θ)}
(ii) E[ğ(s, θ) ğ(s', θ')] = ∫ |ξ| Sp(ξ, θ) exp(j2πξs') α(s, θ; ξ, θ') dξ
where α(s, θ; ξ, θ') is the Radon transform of the plane wave exp{-j2πξ(x cos θ' + y sin θ')} and equals exp(-j2πξs) δ(θ - θ')/|ξ|. Simplify this and obtain (10.56a). Combine (10.56b) and (10.58) to prove (10.60).
10.14 Write the observation model of (10.68) as v(x, y) ≜ R⁻¹w = h(x, y) ⊛ f(x, y) + η(x, y), where h(x, y) ↔ H(ξ1, ξ2) = Hp(ξ, θ) ↔ hp(s, θ) and η = R⁻¹ν, whose
. Problems" Chap.
, 10 471
•
•
power spectrum density is given by (10.67). Show that the frequency response of the
two-dimensional Wiener filter for [(x, y), written in polar coordinates, is A (~, 6) =
H;; 51'llHp 1 51' +. ~iS;;·'r'. Implement this filter in the Radon transform domain. as
2
fp(r, φ) = (1/4π²) ∫0^2π ∫-γ^γ [∂b(σ, β)/∂σ - ∂b(σ, β)/∂β] / [r cos(σ + β - φ) - R sin σ] dσ dβ
where
ψ(σ, β, σ') ≜ [(σ' - σ)/sin(σ' - σ)] [(1/ρ) ∂b(σ, β)/∂σ - (1/ρ) ∂b(σ, β)/∂β],  |σ| ≤ γ;   0,  |σ| > γ
σ' = tan⁻¹[r cos(β - φ) / (R + r sin(β - φ))],   ρ ≜ {[r cos(β - φ)]² + [R + r sin(β - φ)]²}^(1/2) > 0
Show that σ' and ρ correspond to a ray (σ', β) that goes through the object at location (r, φ) and that ρ is the distance between the source and (r, φ). The inner integral in the above Radon inversion formula is the Hilbert transform of ψ and the outer integral is analogous to back-projection.
c. Develop a practical reconstruction algorithm by replacing the Hilbert transform by a bandlimited filter, as in the case of parallel beam geometry.
10.17 (Two-stage reconstruction in three dimensions)
a. Referring to Fig. 10.19, rotate the x- and y-axes by an angle φ, that is, let x' = x cos φ + y sin φ, y' = -x sin φ + y cos φ, and obtain
b. Develop the block diagram for a digital implementation of the two-stage reconstruction algorithm.
•
Section 10.1
For image formation models of CT, PET, MRI and an overview of computerized tomography:
1. IEEE Trans. Nucl. Sci. Special issues on topics related to image reconstruction. NS-21, no. 3 (1974); NS-26, no. 2 (April 1979); NS-27, no. 3 (June 1980).
2. IEEE Trans. Biomed. Engineering. Special issue on computerized medical imaging. BME-28, no. 2 (February 1981).
3. Proc. IEEE. Special issue on computerized tomography. 71, no. 3 (March 1983).
4. A. C. Kak. "Image Reconstruction from Projections," in M. P. Ekstrom (ed.), Digital Image Processing Techniques. New York: Academic Press, 1984, pp. 111-171.
Literature on image reconstruction also appears in other journals such as: J. Comput. Asst. Tomog., Science, Brit. J. Radiol., J. Magn. Reson. Medicine, Comput. Biol. Med., and Medical Physics.
Sections 10.2-10.5
12. J. Radon. "Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten" (On the determination of functions from their integrals along certain manifolds). Berichte Saechsische Akad. Wissenschaften (Leipzig), Math. Phys. Klass 69 (1917): 262-277.
13. D. Ludwig. "The Radon Transform on Euclidean Space." Commun. Pure Appl. Math. 19 (1966): 49-81.
14. D. E. Kuhl and R. Q. Edwards. "Image Separation Isotope Scanning." Radiology 80, no. 4 (1963): 653-662.
15. P. F. C. Gilbert. "The Reconstruction of a Three-Dimensional Structure from Projections and its Application to Electron Microscopy: II. Direct Methods." Proc. Roy. Soc. London Ser. B 182 (1972): 89-102.
16. P. R. Smith, T. M. Peters, and R. H. T. Bates. "Image Reconstruction from Finite Number of Projections." J. Phys. A: Math. Nucl. Gen. 6 (1973): 361-382. Also see New Zealand J. Sci. 14 (1971): 883-896.
17. S. R. Deans. The Radon Transform and Some of Its Applications. New York: Wiley, 1983.
Section 10.6
Sections 10.7-10.8
22. A. K. Jain and S. Ansari. "Radon Transform Theory for Random Fields and Image Reconstruction from Noisy Projections." Proc. ICASSP, San Diego, 1984.
23. A. K. Jain. "Digital Image Processing: Problems and Methods," in T. Kailath (ed.), Modern Signal Processing. Washington: Hemisphere Publishing Corp., 1985.
For reconstruction from noisy projections see the above references and:
. Section 10.9
•
Sections 10.10-10.13
For fan-beam reconstruction theory, see [6, 7] and Horn in [1(iii), pp. 1616-1623].
For algebraic techniques and ART algorithms, see [5, 6] 'and: .
36. M. Bernfeld. "CHIRP Doppler Radar." Proc. IEEE 72, no. 4 (April 1984): 540-541.
37. J. Raviv, J. F. Greenleaf, and G. T. Herman (eds.), Computer Aided Tomography and Ultrasonics in Medicine. Amsterdam: North-Holland, 1979.
11.1 INTRODUCTION
Image data compression is concerned with minimizing the number of bits required
to represent an image. Perhaps the simplest and most dramatic form of data
compression is the sampling of bandlimited images, where an infinite number of
pixels per unit area is reduced to one sample without any loss of information
(assuming an ideal low-pass filter is available). Consequently, the number of sam-
ples per unit area is infinitely reduced. .
Applications of data compression are primarily in transmission and storage of
information. Image transmission applications are in broadcast television, remote
sensing via satellite, military communications via aircraft, radar and sonar, tele-
conferencing, computer communications, facsimile transmission, and the like.
Image storage is required for educational and business documents, medical images
that arise in computerized tomography (CT), magnetic resonance imaging (MRI) and
digital radiology, motion pictures, satellite images, weather maps, geological sur-
veys, and so on. Application of data compression is also possible in the development
of fast algorithms where the number of operations required to implement an algo-
rithm is reduced by working with the compressed data.
Data Compression versus Bandwidth Compression

The mere process of converting an analog video signal into a digital signal results in increased bandwidth requirements for transmission. For example, a 4-MHz television signal sampled at the Nyquist rate with 8 bits per sample would require a bandwidth of 32 MHz when transmitted using a digital modulation scheme, such as phase shift keying (PSK), which requires 1 Hz per 2 bits. Thus, although digitized information has advantages over its analog form in terms of processing flexibility, random access in storage, higher signal-to-noise ratio for transmission with the possibility of errorless communication, and so on, one has to pay the price in terms of this eightfold increase in bandwidth. Data compression techniques seek to minimize this cost and sometimes try to reduce the bandwidth of the digital signal below its analog bandwidth requirements.
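As a quick check of the figures above, the following sketch (plain Python; the signal parameters are those quoted in the text) reproduces the eightfold bandwidth expansion:

# Digital bandwidth of a PCM-coded analog signal (figures from the text).
analog_bandwidth_hz = 4e6      # 4-MHz television signal
bits_per_sample = 8            # PCM word length
bits_per_hz = 2                # PSK efficiency assumed in the text: 2 bits per Hz

nyquist_rate = 2 * analog_bandwidth_hz          # samples per second
bit_rate = nyquist_rate * bits_per_sample       # bits per second
digital_bandwidth_hz = bit_rate / bits_per_hz   # Hz needed for transmission

print(digital_bandwidth_hz / 1e6, "MHz")        # 32.0 MHz, an 8:1 expansion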
Image data compression methods fall into two common categories. In the first category, called predictive coding, are methods that exploit redundancy in the data. Redundancy is a characteristic related to such factors as predictability, randomness, and smoothness in the data. For example, an image of constant gray levels is fully predictable once the gray level of the first pixel is known. On the other hand, a white noise random field is totally unpredictable, and every pixel has to be stored to reproduce the image. Techniques such as delta modulation and differential pulse code modulation fall into this category. In the second category, called transform coding, compression is achieved by transforming the given image into another array such that a large amount of information is packed into a small number of samples. Other image data compression algorithms exist that are generalizations or combinations of these two methods. The compression process inevitably results in some distortion because of accompanying A to D conversion as well as rejection of some relatively insignificant information. Efficient compression techniques tend to minimize this distortion. For digitized data, distortionless compression techniques are possible. Figure 11.1 gives a summary classification of various data compression techniques.
Information Rates

Raw image data rate does not necessarily represent its average information rate, which, for a source with L possible independent symbols occurring with probabilities p_i, i = 0, 1, ..., L - 1, is given by the entropy H = -Σ_i p_i log2 p_i bits per symbol.
[Figure 11.1: Classification of image data compression techniques into pixel coding, predictive coding, transform coding, and other methods (interframe techniques, adaptive techniques, color image coding, vector quantization, miscellaneous).]
[Figure: Block coder/decoder in which the mean and standard deviation of each 4 x 4 image block are computed and the normalized block is matched against a set of prototype patterns u_l(m, n), l = 0, 1, ..., L - 1; the decoder reads out the matched pattern scaled by the transmitted mean and standard deviation.]
11.2 PIXEL CODING

In these techniques each pixel is processed independently, ignoring the interpixel dependencies.

PCM

In PCM the incoming video signal is sampled, quantized, and coded by a suitable code word (before feeding it to a digital modulator for transmission) (Fig. 11.3). The quantizer output is generally coded by a fixed-length binary code word having B bits. Commonly, 8 bits are sufficient for monochrome broadcast or videoconferencing quality images, whereas medical images or color video signals may require 10 to 12 bits per pixel.

The number of quantizing bits needed for visual display of images can be reduced to 4 to 8 bits per pixel by using companding, contrast quantization, or dithering techniques discussed in Chapter 4. Halftone techniques reduce the quantizer output to 1 bit per pixel, but usually the input sampling rate must be increased by a factor of 2 to 16. The compression achieved by these techniques is generally less than 2 : 1.
In terms of a mean square distortion, the minimum achievable rate by PCM is given by the rate-distortion formula

R_PCM = (1/2) log2 (σu²/σq²),    σq² ≤ σu²    (11.4)

where σu² is the variance of the quantizer input and σq² is the quantizer mean square distortion.
Entropy Coding

If the quantized pixels are not uniformly distributed, then their entropy will be less than B, and there exists a code that uses fewer than B bits per pixel. In entropy coding the goal is to encode a block of M pixels containing MB bits, with probabilities p_i, i = 0, 1, ..., L - 1, L = 2^MB, by -log2 p_i bits, so that the average bit rate equals the block entropy H = -Σ_i p_i log2 p_i. This gives a variable-length code for each block, where highly probable blocks (or symbols) are represented by short code words, and vice versa. If -log2 p_i is not an integer, the achieved rate exceeds H but approaches it asymptotically with increasing block size. For a given block size, a technique called Huffman coding is the most efficient fixed-to-variable length encoding method.
1. Arrange the symbol probabilities p_i in decreasing order and consider them as leaf nodes of a tree.
2. While there is more than one node:
   - Merge the two nodes with the smallest probability to form a new node whose probability is the sum of the two merged nodes.
   - Arbitrarily assign 1 and 0 to each pair of branches merging into a node.
3. Read sequentially from the root node to the leaf node where the symbol is located.
The preceding algorithm gives the Huffman code book for any given set of probabilities. Coding and decoding are done simply by looking up values in a table. Since the code words have variable length, a buffer is needed if, as is usually the case, the information is to be transmitted over a constant-rate channel. The size of the code book is L, and the longest code word can have as many as L bits. These parameters become prohibitively large as L increases. A practical version of the Huffman code is called the truncated Huffman code. Here, for a suitably selected L1 < L, the first L1 symbols are Huffman coded and the remaining symbols are coded by a prefix code followed by a suitable fixed-length code.
Another alternative is called the modified Huffman code, where the integer i is represented as

i = qL1 + j,    0 ≤ q ≤ Int[(L - 1)/L1],    0 ≤ j ≤ L1 - 1    (11.5)

The first L1 symbols are Huffman coded. The remaining symbols are coded by a prefix code representing the quotient q, followed by a terminator code, which is the same as the Huffman code for the remainder j, 0 ≤ j ≤ L1 - 1.
• •
The long-term histogram for television images is approximately uniform, al-
though the short-term statistics are highly nonstationary. Consequently entropy
coding is not very practical for raw image data. However, it is quite useful in
predictive and transform coding algorithms and also for coding of binary data such
as graphics and facsimile images.
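The tree-merging procedure above is easy to implement. The following sketch (plain Python; the symbol probabilities are the illustrative values of Fig. 11.4, and the function name is mine) builds a Huffman code book by repeatedly merging the two least probable nodes:

import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code book: repeatedly merge the two least probable
    nodes; 0/1 labels are assigned to the two merged branches."""
    tiebreak = count(len(probs))
    heap = [(p, i, {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # smallest probability
        p1, _, c1 = heapq.heappop(heap)   # next smallest
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Probabilities of the eight symbols s0, ..., s7 from the example of Fig. 11.4
p = [0.25, 0.21, 0.15, 0.14, 0.0625, 0.0625, 0.0625, 0.0625]
code = huffman_code(p)
avg_len = sum(p[s] * len(w) for s, w in code.items())
print(code, avg_len)   # average length close to the source entropy (about 2.78 bits)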
Example 11.1

Figure 11.4 shows an example of the tree structure and the Huffman codes. The algorithm gives code words that can be uniquely decoded, because no code word can be a prefix of any longer code word. For example, if the Huffman coded bit stream is

0 0 0 1 0 1 1 0 1 0 1 1 ...

then the symbol sequence is s0 s2 s1 s3 .... A prefix code (circled elements in the figure) is obtained by reading the code of the leaves that lead up to the first node that serves as a root for the truncated symbols. In this example there are two prefix codes (Fig. 11.4). For the truncated Huffman code the symbols s4, ..., s7 are coded by a 2-bit binary code word. This code happens to be less efficient than the simple fixed-length binary code in this example, but this is not typical of the truncated Huffman code.
Run-Length Coding

Consider a binary source whose output is coded as the number of 0s between two successive 1s, that is, the lengths of the runs of 0s are coded. This is called run-length coding (RLC).
[Figure 11.4: Huffman coding example. The figure shows the code tree and compares four codes for the eight symbols s0, ..., s7 with probabilities 0.25, 0.21, 0.15, 0.14, 0.0625, 0.0625, 0.0625, 0.0625 (source entropy 2.781 bits):

Code                           Average length   Efficiency
3-bit binary code              3.0              92.7%
Huffman code {HC}              2.79             99.7%
Truncated Huffman code {THC}   3.06             90.3%
Modified Huffman code {MHC}    2.915            95.4%

Caption: Huffman coding example. x, x', x'' = prefix codes, y = fixed-length code, z = terminator code. In general z, x', x'' can be different.]
RLC is useful whenever large runs of 0s are expected. Such a situation occurs in printed documents, graphics, weather maps, and so on, where p, the probability of a 0 (representing a white pixel), is close to unity (see Section 11.9).

Suppose the runs are coded in maximum lengths of M and, for simplicity, let M = 2^m - 1. Then it will take m bits to code each run by a fixed-length code. If successive 0s occur independently, then the probability distribution of the run lengths turns out to be the geometric distribution

g(l) = p^l (1 - p),  0 ≤ l ≤ M - 1;    g(M) = p^M    (11.6)
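As a concrete illustration of run-length coding a binary source (a sketch only; the function name and the choice m = 4, hence M = 15, are mine, not the text's):

def run_length_encode(bits, m=4):
    """Code the lengths of runs of 0s between successive 1s. Runs are
    limited to M = 2**m - 1; a full-length run of M zeros (with no
    terminating 1) is emitted as the reserved value M."""
    M = 2 ** m - 1
    runs, run = [], 0
    for b in bits:
        if b == 0:
            run += 1
            if run == M:          # maximum-length run, emit and restart
                runs.append(M)
                run = 0
        else:                     # a 1 terminates the current run of 0s
            runs.append(run)
            run = 0
    return runs                   # each entry fits in m bits

bits = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(run_length_encode(bits))    # [3, 1, 0, 7]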
Basic Principle

The coding process continues recursively in this manner. This method is called differential pulse code modulation (DPCM), or differential PCM. Figure 11.5 shows the DPCM codec (coder-decoder).
[Figure 11.5: The DPCM codec. Coder: the prediction error e(n) = u(n) - ū(n) is quantized to e'(n) and sent over the channel; the reconstruction filter u*(n) = ū(n) + e'(n) and the predictor (with delays) form a feedback loop around the quantizer. Decoder: the same predictor loop reconstructs u*(n) from the received e'(n).]
Note that the coder has to calculate the reproduced sequence u*(n); the decoder is simply the predictor loop of the coder. Rewriting (11.10) as

u(n) = ū(n) + e(n)    (11.12)

and subtracting (11.11) from (11.12), we obtain

δu(n) ≜ u(n) - u*(n) = e(n) - e'(n) = q(n)    (11.13)

Thus, the pointwise coding error in the input sequence is exactly equal to q(n), the quantization error in e(n). With a reasonable predictor the mean square value of the differential signal e(n) is much smaller than that of u(n). This means that, for the same mean square quantization error, e(n) requires fewer quantization bits than u(n).
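The relations above translate directly into a few lines of code. The sketch below (plain Python with NumPy; the first-order predictor ū(n) = a u*(n - 1) and the uniform quantizer step are illustrative choices of mine, not prescribed by the text) shows that the decoder tracks the input to within the quantization error of e(n):

import numpy as np

def dpcm_codec(u, a=0.95, step=4.0):
    """One-dimensional DPCM with a first-order predictor and a uniform
    quantizer; returns the reconstructed sequence u*."""
    u_rec = np.zeros_like(u, dtype=float)    # u*(n), reconstructed samples
    prev = 0.0                               # u*(n-1), predictor memory
    for n, x in enumerate(u):
        pred = a * prev                      # prediction from the output, not the input
        e = x - pred                         # e(n), prediction error
        e_q = step * np.round(e / step)      # e'(n), quantized prediction error
        u_rec[n] = pred + e_q                # reconstruction filter
        prev = u_rec[n]
    return u_rec

u = 100 + 20 * np.sin(np.arange(64) / 5.0)
u_star = dpcm_codec(u)
print(np.max(np.abs(u - u_star)))            # equals max |q(n)|, at most step/2 here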
Feedback Versus Feedforward Prediction

An important aspect of DPCM is (11.9), which says the prediction is based on the output (the quantized samples) rather than the input (the unquantized samples). This places the predictor in a feedback loop around the quantizer, so that the quantizer error at a given step is fed back to the quantizer input at the next step. This has a stabilizing effect that prevents dc drift and accumulation of error in the reconstructed signal u*(n).

On the other hand, if the prediction rule is based on the past inputs (Fig. 11.6a), the signal reconstruction error would depend on all the past and present quantization errors.
[Figure 11.6: (a) Feedforward prediction, in which the predictor operates on the past inputs; (b) distortionless predictive codec, in which an entropy coder/decoder replaces the quantizer. Also shown: a quantizer characteristic with input values over the range -6 to 6 and outputs ±1, used in the numerical comparison below.]
[Tabulated example comparing feedback and feedforward prediction for a sequence u(n) containing an edge at n = 2 (u jumps from 100 to 120): with the predictor inside the quantizer feedback loop the reconstruction error δu(n) stays small and decays after the edge (14, 9, 4, 1), whereas with feedforward prediction the error accumulates (14, 15, 16, 19).]
In digital processing the input sequence u(n) is generally digitized at the source itself by a sufficient number of bits (typically 8 for images). Then u(n) may be considered as an integer sequence. By requiring the predictor outputs to be integer values, the prediction error sequence will also take integer values and can be entropy coded without distortion. This gives a distortionless predictive codec (Fig. 11.6b), whose minimum achievable rate would be equal to the entropy of the prediction error sequence e(n).
Performance Analysis of DPCM

Denoting the mean square values of the quantization error q(n) and the prediction error e(n) by σq² and σe², respectively, and noting that (11.13) implies

E[(δu(n))²] = σq²    (11.14)

the minimum achievable rate by DPCM is given by the rate-distortion formula [see (2.116)]

R_DPCM = (1/2) log2 (σe²/σq²)    bits/pixel    (11.15)

In deducing this relationship, we have used the fact that common zero-memory quantizers (for arbitrary distributions) do not achieve a rate lower than the Shannon quantizer for Gaussian distributions (see Section 4.9). For the same distortion σq² ≤ σe², the reduction in DPCM rate compared to PCM is [see (11.4)]

R_PCM - R_DPCM = (1/2) log2 (σu²/σe²)    (11.16)
If the feedforward prediction error ε(n) has variance β², then

β² ≤ σe²    (11.18)

This is true because ū(n) is based on the quantization-noise-containing samples {u*(m), m < n} and can never be better than û(n), the predictor based on the exact past inputs. As the number of quantization levels is increased to infinity, σe² approaches β². Hence, a lower bound on the rate achievable by DPCM is

R_min = (1/2) log2 (β²/σq²)    (11.19)

When the quantization error is small, R_DPCM approaches R_min. This expression is useful because it is much easier to evaluate β² than σe² in (11.15), and it can be used to estimate the SNR improvement of DPCM over PCM, which is proportional to the log of the variance ratio σu²/β². Using (11.16), we note that the increase in SNR is approximately 6(R_PCM - R_DPCM) dB, that is, 6 dB per bit of available compression.
From these measures we see that the performance of predictive coders depends on the design of the predictor and the quantizer. For simplicity, the predictor is designed without considering the quantizer effects. This means the prediction rule deemed optimum for u(n) is applied to the reconstructed samples u*(n). For example, if û(n) is given by (11.17), then the DPCM predictor is designed as

ū(n) = Σ_k a(k) u*(n - k)    (11.22)

In two (or higher) dimensions this approach requires finding the optimum causal prediction rules. Under the mean square criterion the minimum variance causal representations can be used directly. Note that the DPCM coder remains nonlinear even with the linear predictor of (11.22). However, the decoder will now be a linear filter. The quantizer is generally designed using the statistical properties of the innovations sequence ε(n), which can be estimated from the predictor design.
Figure 11.8 shows a typical prediction error signal histogram. Note that the prediction error is concentrated near zero and is commonly modeled by the Laplacian density

p(e) = (1/(√2 β)) exp(-√2 |e| / β)

[Figure 11.8: Prediction error histogram; the relative frequency peaks at about 0.5 near zero error and falls off rapidly over the range -64 to 64.]
where β² is its variance. The quantizer is generally chosen to be either the Lloyd-Max quantizer (for a constant bit rate at the output) or the optimum uniform quantizer (followed by an entropy coder to minimize the average rate). Practical predictive codecs differ with respect to realizations and the choices of predictors and quantizers. Some of the common classes of predictive codecs for images are described next.
Delta Modulation

Delta modulation (DM) is the simplest of the predictive coders. It uses a one-step delay function as a predictor and a 1-bit quantizer, giving a 1-bit representation of the signal. Thus

ū(n) = u*(n - 1),    e(n) = u(n) - u*(n - 1)    (11.24)

A practical DM system, which does not require sampling of the input signal, is shown in Fig. 11.9a. The predictor integrates the quantizer output, which is a sequence of binary pulses. The receiver is a simple integrator. Figure 11.9b shows typical input-output signals of a delta modulator. The primary limitations of delta modulation are (1) slope overload, (2) granularity noise, and (3) instability to channel errors. Slope overload occurs whenever there is a large jump or discontinuity in the signal, to which the quantizer can respond only in several delta steps. Granularity noise is the steplike nature of the output when the input signal is almost constant. Figure 11.10b shows the blurring effect of slope overload near the edges and the granularity effect in the constant gray-level background.

Both of these errors can be compensated to a certain extent by low-pass filtering the input and output signals. Slope overload can also be reduced by increasing the sampling rate, which will reduce the interpixel differences. However, the higher sampling rate will tend to lower the achievable compression. An alternative for reducing granularity while retaining simplicity is to go to a tristate delta modulator. The advantage is that a large number (65 to 85%) of pixels are found to be in the level, or 0, state, whereas the remaining pixels are in the ±1 states. Huffman coding the three states, or run-length coding the 0 states with a 2-bit code for the other states, yields rates around 1 bit per pixel for different images [14].

The reconstruction filter, which is a simple integrator, is unstable. Therefore, in the presence of channel errors, the receiver output can accumulate large errors. It can be stabilized by attenuating the predictor output by a positive constant less than unity (called a leak). This will, however, not retain the simple realization of Fig. 11.9a.
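A discrete-time sketch of (11.24) with a leak (plain Python with NumPy; the step size, the leak value, and the test signal are illustrative choices, not from the text) makes the slope-overload and granularity behavior easy to reproduce:

import numpy as np

def delta_modulate(u, step=2.0, leak=1.0):
    """1-bit DPCM (delta modulation): the predictor is the (leaky) previous
    reconstructed sample; the quantizer output is +/- step."""
    u_rec = np.zeros_like(u, dtype=float)
    prev = 0.0
    for n, x in enumerate(u):
        pred = leak * prev                   # leak < 1 limits channel-error buildup
        bit = 1.0 if x >= pred else -1.0     # 1-bit quantizer
        u_rec[n] = pred + bit * step         # integrator at the receiver
        prev = u_rec[n]
    return u_rec

# A ramp followed by a flat region: the ramp shows slope overload when its
# slope exceeds the step size; the flat region shows granularity (the output
# oscillates by +/- step about the constant level).
u = np.concatenate([np.linspace(0, 60, 15), np.full(15, 60.0)])
print(np.round(delta_modulate(u), 1))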
For delta modulation of images, the signal is generally presented line by line, and no advantage is taken of the two-dimensional correlation in the data. When each scan line of the image is represented by a first-order AR process (after subtracting the mean)

u(n) = ρ u(n - 1) + ε(n)    (11.25)
[Figure 11.9: Delta modulation. (a) A practical system: the low-pass filtered input is compared with the integrated quantizer output, and the 1-bit quantizer output e'(n) is transmitted; the receiver integrates e'(n) and low-pass filters the result. (b) Typical waveforms, illustrating granularity in flat regions and slope overload at rapid transitions.]
the SNR of the reconstructed signal is given, approximately, by (11.26) (see Problem 11.4).

PCM at 1 bit per pixel. This amounts to a compression of 2.5, or a savings of about 1.5 bits per pixel. Equations (11.25) and (11.26) indicate the SNR of delta modulation
[Figure 11.10: Coding examples: (b) delta modulation, showing slope-overload blurring near edges and granularity in the flat background; (c) line-by-line DPCM, 3 bits/pixel; (d) two-dimensional DPCM, 3 bits/pixel.]
Line-by-Line DPCM

In this method each scan line of the image is coded independently by the DPCM technique. Generally, a suitable AR representation is used for designing the predictor. Thus, if we have a pth-order stationary AR sequence (see Section 6.2)

u(n) = Σ_{k=1}^{p} a(k) u(n - k) + ε(n)

For the first-order AR model of (11.25), the SNR of a B-bit DPCM system output can be estimated as (Problem 11.6)

(SNR)_DPCM = 10 log10 [ (1 - ρ² f(B)) / ((1 - ρ²) f(B)) ]    dB    (11.29)

For ρ = 0.95 and a Laplacian density-based quantizer, roughly 8-dB to 10-dB SNR improvement over PCM can be expected at rates of 1 to 3 bits per pixel. Alternatively, for small distortion levels (f(B) ≈ 0), the rate reduction over PCM is [see (11.16)]

R_PCM - R_DPCM ≈ -(1/2) log2 (1 - ρ²)

This means, for example, that the SNR of 6-bit PCM can be achieved by 4-bit line-by-line DPCM for ρ = 0.97. Figure 11.10c shows a line-by-line DPCM coded image at 3 bits per pixel.
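Evaluating the small-distortion rate reduction above for the two correlation values quoted in the text (a plain Python check of the formula just given):

import math

def dpcm_rate_reduction(rho):
    """Approximate bit savings of first-order DPCM over PCM,
    R_PCM - R_DPCM = -0.5 * log2(1 - rho^2), small-distortion case."""
    return -0.5 * math.log2(1.0 - rho ** 2)

for rho in (0.95, 0.97):
    dr = dpcm_rate_reduction(rho)
    print(rho, round(dr, 2), "bits,", round(6 * dr, 1), "dB")
# rho = 0.95 -> about 1.7 bits (about 10 dB); rho = 0.97 -> about 2 bits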
Two-Dimensional DPCM

The foregoing ideas can be extended to two dimensions by using the causal MVRs discussed in Chapter 6 (Section 6.6), which define a predictor of the form

û(m, n) = a1 u(m - 1, n) + a2 u(m, n - 1) + a3 u(m - 1, n - 1) + a4 u(m - 1, n + 1)

where the four previously scanned neighbors are those shown in Fig. 11.11.

[Figure 11.11: Prediction neighborhood for two-dimensional DPCM: the current pixel is predicted from the pixel to its left on the current line (A) and the pixels B, C, D on the line above.]

Here a1, a2, a3, a4, and β² are obtained by solving the linear (orthogonality) equations, the first of which is

r(1, 0) = a1 r(0, 0) + a2 r(1, -1) + a3 r(0, 1) + a4 r(0, 1)    (11.33)

where r(k, l) is the covariance function of u(m, n). In the special case of the separable covariance function of (2.84), we obtain

β² = σ²(1 - ρ1²)(1 - ρ2²)    (11.34)
Recall from Chapter 6 that, unlike the one-dimensional case, this solution of (11.33) can give rise to an unstable causal model. This means that while the prediction error variance will be minimized (ignoring the quantization effects), the reconstruction filter could be unstable, causing any channel error to be amplified greatly at the receiver. Therefore, the predictor has to be tested for stability and, if not stable, it has to be modified (at the cost of either increasing the prediction error variance or increasing the predictor order). Fortunately, for common monochrome image data (such as television images), this problem is rarely encountered.
Given the predictor as just described, the equations for a two-dimensional DPCM system become

Predictor:  ū(m, n) = a1 u*(m - 1, n) + a2 u*(m, n - 1) + a3 u*(m - 1, n - 1) + a4 u*(m - 1, n + 1)    (11.35a)

Quantizer input:  e(m, n) = u(m, n) - ū(m, n)    (11.35b)

Reconstruction filter:  u*(m, n) = ū(m, n) + e'(m, n)    (11.35c)

The performance bounds of this method can be evaluated via (11.19) and (11.20). An example of two-dimensional DPCM coding at 3 bits per pixel is shown in Fig. 11.10d.
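A direct transcription of (11.35) (plain Python with NumPy; the coefficients a1 = ρ1, a2 = ρ2, a3 = -ρ1ρ2, a4 = 0 are the separable-model choices used only for illustration, and the uniform quantizer step is mine):

import numpy as np

def dpcm2d(u, a1=0.95, a2=0.95, a3=-0.95 * 0.95, a4=0.0, step=8.0):
    """Two-dimensional DPCM following (11.35): causal prediction from the
    reconstructed neighbors, uniform quantization of the prediction error."""
    M, N = u.shape
    rec = np.zeros((M + 1, N + 2))            # padded u*(m, n) with zero boundaries
    err = np.zeros((M, N))                    # quantized prediction errors e'(m, n)
    for m in range(M):
        for n in range(N):
            pred = (a1 * rec[m, n + 1] + a2 * rec[m + 1, n] +
                    a3 * rec[m, n] + a4 * rec[m, n + 2])       # (11.35a)
            e = u[m, n] - pred                                  # (11.35b)
            e_q = step * np.round(e / step)
            err[m, n] = e_q
            rec[m + 1, n + 1] = pred + e_q                      # (11.35c)
    return rec[1:, 1:-1], err

u = np.add.outer(np.arange(16), np.arange(16)).astype(float) * 4  # smooth test ramp
u_star, e_q = dpcm2d(u)
print(np.max(np.abs(u - u_star)))   # coding error bounded by step/2 = 4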
[Figure 11.12: Performance of predictive coders: SNR (dB) versus rate (bits/sample) for a Gauss-Markov source with ρ = 0.95, comparing PCM, line-by-line DPCM, and two-dimensional DPCM.]
Performance Comparisons

Figure 11.12 shows the theoretical SNR versus bit rate of two-dimensional DPCM for images modeled by (11.34) and (11.35) with a4 = 0. Comparison with one-dimensional line-by-line DPCM and PCM is also shown. Note that delta modulation is the same as 1-bit DPCM in these curves. In practice, two-dimensional DPCM does not achieve quite as much as the 20-dB improvement over PCM expected for random fields with the parameters of (11.34). This is because the two-dimensional separable covariance model is overly optimistic about the variance of the prediction error. Figure 11.13 compares the coding-error images in one- and two-dimensional DPCM.
[Figure 11.13: One- and two-dimensional DPCM images coded at 1 bit (upper images) and 3 bits (lower images) per pixel and their errors in reproduction; (a) one-dimensional, (b) two-dimensional.]
Remarks

Strictly speaking, the predictors used in DPCM are for zero mean data (that is, data whose dc value is zero). Otherwise, for a constant background μ, the predicted value

ū(m, n) = (a1 + a2 + a3 + a4)μ    (11.36)

would yield a bias of (1 - a1 - a2 - a3 - a4)μ, which would be zero only if the sum of the predictor coefficients is unity. Theoretically, that choice yields an unstable reconstruction filter (e.g., in delta modulation with no leak). This bias can be minimized by (1) choosing predictor coefficients whose sum is close to but less than unity, (2) designing the quantizer reconstruction level to be zero for inputs near zero, and (3) tracking the mean of the quantizer output and feeding the bias correction to the predictor.

The quantizer should be designed to limit the three types of degradation: granularity, slope overload, and edge-busyness. Coarsely placed inner levels of the quantizer cause granularity in the flat regions of the image. Slope overload occurs at high-contrast edges where the prediction error exceeds the extreme levels of the quantizer, resulting in blurred edges. Edge-busyness is caused at less sharp edges, where the reproduced pixels on adjacent scan lines have different quantization levels. In the region of edges the optimum mean square quantizer based on the Laplacian density for the prediction error sequence turns out to be too companded; that is, the inner quantization steps are too small, whereas the outer levels are too coarse, resulting in edge-busyness. A solution for minimizing these effects is to increase the number of quantizer levels and use an entropy coder for its outputs. This increases the dynamic range and the resolution of the quantizer. The average coder rate will now depend on the relative occurrences of the edges. Another alternative is to incorporate visual properties in the quantizer design using the visibility function [18]. In practice, standard quantizers are optimized iteratively to achieve appropriate subjective picture quality.

In hardware implementations of two-dimensional DPCM, the predictor is often simplified to minimize the number of multiplications per step. With reference to Fig. 11.11, some simplified prediction rules are discussed in Table 11.2. The choice of prediction rule is also influenced by the response of the reconstruction filter to channel errors. See Section 11.8 for details.

For interlaced image frames, the foregoing design principles are applied to each field rather than each frame. This is because successive fields are 1/60 s apart, and the intrafield correlations are expected to be higher (in the presence of motion) than the pixel correlations in the de-interlaced adjacent lines.

Overall, DPCM is simple and well suited for real-time (video rate) hardware implementation. The major drawbacks are its sensitivity to variations in image statistics and to channel errors. Adaptive techniques can be used to improve the compression performance of DPCM. (Channel-error effects are discussed in Section 11.8.)
Adaptive Techniques
The performance of DPCM can be improved by adapting the quantizer and pre-
dictor characteristics to variations in the local statistics of the image data. Adaptive
techniques use a range of quantizing characteristics and/or predictors from which a
"current optimum" is selected according to local image properties. To eliminate the
overhead due to the adaptation procedure, previously coded pixels are used to
determine the mode of operation of the adaptive coder. In the absence of trans-
mission errors, this allows the receiver to follow the same sequence of decisions
made at the transmitter. Adaptive predictors are generally designed to improve the
subjective image quality, especially at the edges. A popular technique is to use
several predictors, each of which performs well if the image is highly correlated in a
certain direction. The direction of maximum correlation is computed from previ-
ously coded pixels and the corresponding predictor is chosen.
Adaptive quantization schemes are based on two approaches, as discussed next.
•
Adaptive gain control. For a fixed predictor, the variance of the prediction
error will fluctuate with changes in spatial details of the image. A simple adaptive
quantizer updates the variance of the prediction error at each step and adjusts the
spacing of the quantizer levels accordingly. This can be done by normalizing the
prediction error by its updated standard deviation and designing the quantizer levels
for unit variance inputs (Fig. 11.14a).
Let (T; (j) and &; (j) denote the variances of the quantizer input and output,
respectively, at step j of a DPCM loop. (For a two-dimensional system, this means
we are mapping (m, n) into j.) Since « (j) is available at the transmitter as well as
[Figure 11.14: Adaptive quantization. (a) Adaptive gain control: the prediction error is normalized by a gain estimated from the quantizer output before quantization and rescaled at the receiver. (b) Adaptive classification: an activity measure computed from previously coded pixels selects one of several quantizers.]
A simple estimate, called the exponential average variance estimator, is of the form

σ̂e²(j + 1) = (1 - γ)[e'(j)]² + γ σ̂e²(j),    σ̂e²(0) = (e'(0))²,    j = 0, 1, ...    (11.37)

where 0 < γ ≤ 1. For small quantization errors, we may use σ̂e(j) as an estimate of σe(j). For Lloyd-Max quantizers, since the variance of the input equals the sum of the variances of the output and the quantization error [see (4.47)], we can obtain a recursion for σe²(j) as well [see (11.38) and (11.39)], where γ is a constant determined experimentally so that the mean square error is minimized. The above estimates become poor at low rates, for example, when B = 1. An alternative, originally suggested for adaptive delta modulation [7], is to define a gain σe(j) ≜ g(m, n), which is recursively updated as

g(m, n) = Σ_{(k,l) ∈ W} α_{k,l} g(m - k, n - l) M(|q_{m-k,n-l}|)    (11.40)

where M(|q_i|) is a multiplier factor that depends on the quantizer levels q_i, and the α_{k,l} are weights that sum to unity. Often α_{k,l} = 1/N_w, where N_w is the number of pixels in the causal window W. For example (see Table 11.1), for a three-level quantizer (L = 3) using the predictor neighbors of Fig. 11.11 and the gain-control formula

g(m, n) = (1/2)[g(m - 1, n)M(|q_{m-1,n}|) + g(m, n - 1)M(|q_{m,n-1}|)]    (11.41)

the multiplier factor M(|q|) takes the values M(0) = 0.7, M(±q1) = 1.7. The values in Table 11.1 are based on experimental studies [19] on 8-bit images.
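A sketch of the gain-controlled quantizer update of (11.41) (plain Python; the multipliers M(0) = 0.7 and M(±q1) = 1.7 follow the text, while the quantizer decision level and the initial gain are arbitrary choices of mine):

def adapt_gain(g_left, g_up, q_left, q_up):
    """Gain update of (11.41) for a three-level quantizer: the multiplier is
    0.7 for a zero output level and 1.7 for a +/-q1 output level."""
    M = lambda q: 0.7 if q == 0.0 else 1.7
    return 0.5 * (g_left * M(abs(q_left)) + g_up * M(abs(q_up)))

def quantize3(e, gain, q1=1.0):
    """Three-level quantizer applied to the gain-normalized prediction error."""
    x = e / gain
    q = 0.0 if abs(x) < 0.5 * q1 else (q1 if x > 0 else -q1)
    return q * gain, q      # rescaled output and the normalized level

# One step: a flat region (previous outputs 0) shrinks the gain,
# an active region (previous outputs +/-q1) expands it.
print(adapt_gain(8.0, 8.0, 0.0, 0.0))    # 5.6  (gain decreases)
print(adapt_gain(8.0, 8.0, 1.0, -1.0))   # 13.6 (gain increases)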
Adaptive classification. Adaptive classification schemes segment the image into different regions according to spatial detail, or activity, and different quantizer characteristics are used for each activity class (Fig. 11.14b). A simple measure of activity is the variance of the pixels in the neighborhood of the pixel to be predicted. The flat regions are quantized more finely than edges or detailed areas. This scheme takes advantage of the fact that noise visibility decreases with increased activity. Typically, up to four activity classes are sufficient. An example would be to divide the image into 16 x 16 blocks and classify each block into one of four classes. This requires only a small overhead of 2 bits per block of 256 pixels.

[Table 11.1: Multiplier factors M(|q|) for adaptive gain control: for a three-level quantizer the multipliers are 0.7 and 1.7; for a five-level quantizer they include 0.8, 1.0, and 2.6; for a seven-level quantizer they include 0.6, 1.0, 1.5, and 4.0.]
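A sketch of the block-classification step (plain Python with NumPy; the 16 x 16 block size and the four classes follow the example in the text, but the variance thresholds are arbitrary values of mine):

import numpy as np

def classify_blocks(img, block=16, thresholds=(25.0, 100.0, 400.0)):
    """Assign each block an activity class 0..3 from its pixel variance;
    flat blocks (class 0) would be quantized most finely."""
    M, N = img.shape
    classes = np.zeros((M // block, N // block), dtype=int)
    for i in range(0, M - block + 1, block):
        for j in range(0, N - block + 1, block):
            activity = np.var(img[i:i + block, j:j + block])
            classes[i // block, j // block] = int(np.searchsorted(thresholds, activity))
    return classes   # 2 bits of side information per 256-pixel block

img = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(float)
print(classify_blocks(img))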
The successive outputs of the quantizer may have significant correlation, and the predictor may not be good enough. Two methods that can improve the performance are coding with delay (tree coding) and the use of a quantizer with memory.

In the first method [17], a tree code is generated by the prediction filter excited by the different quantization levels. As successive pixels are coded, the predictor selects a path in the tree (rather than a branch value, as in DPCM) such that the mean square error is minimized. Delays are introduced in the predictor to enable development of a tree with sufficient look-ahead paths.

In the second method [20], the successive inputs to the quantizer are entered in a shift register, whose state is used to define the quantizer output value. Thus the quantizer's current output depends on its previous outputs.
[Figure 11.15: One-dimensional transform coding. The input vector u is transformed by a linear transform A into v = Au; each coefficient v(k) is quantized independently to v'(k); the decoder applies the linear transform B to give u' = Bv'.]
2. The Lloyd-Max quantizer for each v(k) minimizes the overall mean square error, giving (11.45).

3. The optimal decorrelating matrix A is the KL transform of u; that is, the rows of A are the orthonormalized eigenvectors of the autocovariance matrix R of u. This gives (11.46).
Proofs

1. In terms of the transformed vectors v and v', the distortion can be written as (11.47a)

...

D = (1/N) Tr[A⁻¹ F A R]    (11.53)

where F and Λ do not depend on A. Minimizing D with respect to A, we obtain (see Problem 2.15)

... is diagonal. But A R A*ᵀ is also diagonal. Therefore, these two matrices must be related by a diagonal matrix G, as

† Note that σe(k) is independent of the transform A.
Remarks

Not being a fast transform in general, the KL transform can be replaced either by a fast unitary transform, such as the cosine, sine, DFT, Hadamard, or slant transform, which is not a perfect decorrelator, or by a fast decorrelating transform, which is not unitary. In practice, the former choice gives better performance (Problem 11.9).

The foregoing result establishes the optimality of the KL transform among all decorrelating transformations. It can be shown that it is also optimal among all unitary transforms (see Problem 11.8) and also performs better than DPCM (which can be viewed as a nonlinear transform; Problem 11.10).
The transform coefficient variances are generally unequal, and therefore each coefficient requires a different number of quantizing bits. To complete the transform coder design we have to allocate a given number of total bits among all the transform coefficients so that the overall distortion is minimized. Referring to Fig. 11.15, for any unitary transform A, arbitrary quantizers, and B = A⁻¹ = Aᴴ, the distortion becomes

D = (1/N) Σ_{k=0}^{N-1} E[|v(k) - v'(k)|²] = (1/N) Σ_{k=0}^{N-1} σk² f(nk)    (11.56)

where σk² is the variance of the transform coefficient v(k), which is allocated nk bits, and f(·), the quantizer distortion function, is monotone convex with f(0) = 1 and f(∞) = 0. We are given a desired average bit rate per sample, B; then the rate for the A-transform coder is

RA = (1/N) Σ_{k=0}^{N-1} nk = B    (11.57)

The bit allocation problem is to find nk ≥ 0 that minimize the distortion D, subject to (11.57). Its solution is given by the following algorithm.
Bit Allocation Algorithm

Step 1. Define the inverse function of f'(x) ≜ df(x)/dx as h(x) ≜ f'⁻¹(x), that is, h(f'(x)) = x. Find θ, the root of the nonlinear equation

(1/N) Σ_{k: σk² ≥ θ} h(θ f'(0)/σk²) = B    (11.58)

The solution may be obtained by an iterative technique such as Newton's method. The parameter θ is a threshold that controls which transform coefficients are to be coded for transmission.

Step 2. The number of bits allocated to the kth transform coefficient is given by

nk = 0,                   σk² < θ
nk = h(θ f'(0)/σk²),      σk² ≥ θ    (11.59)

Note that the coefficients whose mean square value falls below θ are not coded at all.
For the distortion function f(x) = 2^(-2x), these expressions reduce to

nk = max(0, (1/2) log2(σk²/θ))    (11.61)

D = (1/N) [ Σ_{σk² < θ} σk² + Σ_{σk² ≥ θ} θ ] = (1/N) Σ_k min(θ, σk²)    (11.62)

RA = (1/N) Σ_{k=0}^{N-1} max(0, (1/2) log2(σk²/θ))    (11.63)
In the case of the KL transform, σk² = λk and Πk λk = |R|, which gives

R_KL = (1/(2N)) log2(|R| / D^N),    D ≤ min{λk}    (11.65)

where R ≜ {r(m, n)/σu²} is the correlation matrix of u and σu² is the variance of its elements.
Example 11.3

The determinant of the covariance matrix R = {ρ^|m-n|} of a Markov sequence of length N is |R| = (1 - ρ²)^(N-1). This gives

R_KL = (1/(2N)) log2[(1 - ρ²)^(N-1) / D^N],    D ≤ min{λk}    (11.68)

For N = 16 and ρ = 0.95, the value of min{λk} is 0.026 (see Table 5.2). So for D = 0.01 we get R_KL ≈ 1.81 bits per sample. Rearranging (11.68) we can write

R_KL = (1/2) log2[(1 - ρ²)/D] - (1/(2N)) log2(1 - ρ²)    (11.69)

As N → ∞, the rate R_KL goes down to a lower bound R_KL(∞) = (1/2) log2[(1 - ρ²)/D], and R_PCM - R_KL(∞) = -(1/2) log2(1 - ρ²) = 1.6 bits per sample. Also, as N → ∞, the eigenvalues of R follow the distribution λ(ω) = (1 - ρ²)/(1 + ρ² + 2ρ cos ω), which gives min{λ} = (1 - ρ²)/(1 + ρ)² = (1 - ρ)/(1 + ρ). For ρ = 0.95, D = 0.01 we obtain R_KL(∞) ≈ 1.6 bits per sample.
Integer Bit Allocation Algorithm. The number of quantizing bits nk is often specified as an integer. Then the solution of the bit allocation problem is obtained by applying a theory of marginal analysis [6, 21], which yields the following simple algorithm: at step j, allocate the jth bit to the quantizer k for which the marginal return

Δ_{k,j} ≜ σk² [f(nk^(j-1)) - f(nk^(j-1) + 1)]

is maximum. Δ_{k,j} is the reduction in distortion if the jth bit is allocated to the kth quantizer. If ties occur for the maximizing index, the procedure is successively initiated with the allocation nk^j = nk^(j-1) + δ(i - k) for each tying index i. This algorithm simply means that the marginal returns are arranged in decreasing order and bits are assigned one by one according to this order. For an average bit rate of B, we have to search N marginal returns NB times.

This algorithm can be speeded up whenever the distortion function is of the form f(x) = a 2^(-bx). Then Δ_{k,j} = (1 - 2^(-b)) σk² f(nk^(j-1)), which means the quantizer having the largest distortion at any step j is allocated the next bit. Thus, as we allocate a bit, we update the quantizer distortion, and step 2 of the algorithm becomes:

Step 2. Find the index i for which Di = σi² f(ni^(j-1)) is maximum. Then

ni^j = ni^(j-1) + 1,    Di ← 2^(-b) Di

The piecewise exponential models of Table 4.4 can be used to implement this step.
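A direct implementation of the greedy (marginal analysis) allocation (plain Python with NumPy; the exponential distortion model f(x) = 2^(-2x) and the example variances are illustrative choices of mine):

import numpy as np

def integer_bit_allocation(variances, total_bits, f=lambda x: 2.0 ** (-2 * x)):
    """Greedy integer bit allocation: at every step the next bit goes to the
    coefficient with the largest marginal return sigma_k^2 * (f(n_k) - f(n_k + 1))."""
    variances = np.asarray(variances, dtype=float)
    n = np.zeros(len(variances), dtype=int)
    for _ in range(total_bits):
        gain = variances * (f(n) - f(n + 1))   # reduction in distortion per coefficient
        k = int(np.argmax(gain))
        n[k] += 1
    distortion = float(np.mean(variances * f(n)))
    return n, distortion

sigma2 = [6.2, 2.4, 1.0, 0.4, 0.2, 0.1, 0.05, 0.02]     # example coefficient variances
bits, D = integer_bit_allocation(sigma2, total_bits=16)  # average 2 bits/sample
print(bits, round(D, 4))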
11.5 TRANSFORM CODING OF IMAGES

The foregoing one-dimensional transform coding theory can be easily generalized to two dimensions by simply mapping a given N x M image u(m, n) into an NM x 1 vector u. The KL transform of u would be a matrix of size NM x NM. In practice, this transform is replaced by a separable fast transform such as the cosine, sine, Fourier, slant, or Hadamard transform; these, as we saw in Chapter 5, pack a considerable amount of the image energy into a small number of coefficients.

To make transform coding practical, a given image is divided into small rectangular blocks, and each block is transform coded independently. For an N x M image divided into NM/pq blocks, each of size p x q, the main storage requirements for implementing the transform are reduced by a factor of NM/pq. The computational load is reduced by a factor of log2 NM / log2 pq for a fast transform requiring αN log2 N operations to transform an N x 1 vector. For 512 x 512 images divided into 16 x 16 blocks, these factors are 1024 and 2.25, respectively. Although the operation count is not greatly reduced, the complexity of the hardware for implementing small-size transforms is reduced significantly. However, smaller block sizes yield lower compression, as shown by Fig. 11.16. Typically, a block size of 16 x 16 is used.
[Figure 11.16: Rate achievable by block KL transform coders for Gaussian random fields with separable covariance function, ρ1 = ρ2 = 0.95, at distortion D = 0.25%.]

Two-Dimensional Transform Coding Algorithm. We now state a practical transform coding algorithm for images (Fig. 11.17).
[Figure 11.17: Two-dimensional transform coding. (a) Coder: each p x q block Ui is transformed as Vi = Ap Ui Aq^T, quantized, coded, and transmitted or stored. (b) Decoder: the received coefficients Vi' are decoded and inverse transformed as Ui' = Ap^T Vi' Aq.]
1. Divide the given image. Divide the image into small rectangular blocks of size p x q and transform each block to obtain Vi, i = 0, ..., I - 1, I ≜ NM/pq.

2. Determine the bit allocation. Calculate the transform coefficient variances σ²(k, l) via (5.35) of Problem 5.29b if the image covariance function is given. Alternatively, estimate the variances from the ensemble of coefficients Vi(k, l), i = 0, ..., I - 1, obtained from a given prototype image normalized to have unit variance; from this, the variances for an image with variance σ² are estimated. These coefficients are quantized by an nk,l-bit quantizer, which is designed for zero mean, unit variance inputs. Coefficients that are allocated zero bits are not transmitted.
8 7 6 5 3 3 2 2 2 1 1 1 1 1 0 0
7 6 5 4 3 3 2 2 1 1 1 1 1 0 0 0
6 5 4 3 3 2 2 2 1 1 1 1 1 0 0 0
5 4 3 3 3 2 2 2 1 1 1 1 1 0 0 0
3 3 3 3 2 2 2 1 1 1 1 1 0 0 0 0
3 3 2 2 2 2 2 1 1 1 1 1 0 0 0 0
2 2 2 2 2 2 1 1 1 1 1 0 0 0 0 0
2 2 2 2 1 1 1 1 1 1 1 0 0 0 0 0
2 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Figure 11.18 Bit allocation for 16 x 16 block cosine transform coding of images modeled by the isotropic covariance function with ρ = 0.95. Average rate ≈ 1 bit per pixel.
Once a bit assignment for the transform coefficients has been determined, the performance of the coder can be estimated by the relations

D = (1/pq) Σ_{k=0}^{p-1} Σ_{l=0}^{q-1} σ²k,l f(nk,l),    RA = (1/pq) Σ_{k=0}^{p-1} Σ_{l=0}^{q-1} nk,l    (11.74)
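To make the block coding procedure concrete, the following sketch (plain Python with NumPy; the 8 x 8 block size, the hand-built orthonormal DCT matrix, and the simple triangular zonal bit assignment are illustrative choices of mine, not the text's prescriptions) transforms each block, quantizes the coefficients according to a bit assignment, and reconstructs:

import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def block_transform_codec(img, bits, block=8):
    """Block transform coding: V = A U A^T, quantize each coefficient with
    the allocated number of bits, inverse transform A^T V' A."""
    A = dct_matrix(block)
    out = np.zeros_like(img, dtype=float)
    for i in range(0, img.shape[0], block):
        for j in range(0, img.shape[1], block):
            U = img[i:i + block, j:j + block]
            V = A @ U @ A.T
            Vq = np.where(bits > 0,
                          np.round(V / 2.0 ** (8 - bits)) * 2.0 ** (8 - bits),
                          0.0)                     # coarser steps for fewer bits
            out[i:i + block, j:j + block] = A.T @ Vq @ A
    return out

# Zonal bit assignment: more bits to low-frequency coefficients (illustrative).
bits = np.maximum(0, 8 - np.add.outer(np.arange(8), np.arange(8)))
img = np.random.default_rng(1).integers(0, 256, (32, 32)).astype(float)
rec = block_transform_codec(img, bits)
print(round(float(np.mean((img - rec) ** 2)), 2))   # mean square coding error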
[Figure 11.19: Distortion versus rate characteristics for different transforms (Fourier and others) for a two-dimensional isotropic random field; rate axis from 0.25 to 4.0 bits/sample.]
[Table 11.3: SNR comparisons of various transform coders for random fields with isotropic covariance function, ρ = 0.95.]

When plotted as a function of N (Fig. 11.16), this shows that a block size of 16 x 16 is suitable for ρ = 0.95. For higher values of the correlation parameter ρ, the block size should be increased. Figure 11.20 shows some 16 x 16 block coded results.
Example 11.6 Choice of covariance model

The transform coefficient variances are important for designing the quantizers. Although the separable covariance model is convenient for analysis and design of transform coders, it is not very accurate. Figure 11.20 shows the results of 16 x 16 cosine transform coders based on the separable covariance model, the isotropic covariance model, and the actual measured transform coefficient variances. As expected, the actual measured variances yield the best coder performance. Generally, the isotropic covariance model performs better than the separable covariance model.
m(k, l) = 1,  (k, l) ∈ I_t;    m(k, l) = 0,  otherwise    (11.76)

[Figure 11.20: 16 x 16 cosine transform coding results (coded images and error images): (a) separable covariance model, SNR' = 37.5 dB; (b) isotropic covariance model, SNR' = 37.8 dB; (c) measured covariances, SNR' = 40.3 dB.]

[Figure 11.21: Examples of coefficient-selection masks: (a) a zonal mask retaining the low-frequency (upper-left) coefficients of the block; (b) a threshold mask retaining the largest-amplitude coefficients.]
Unlike a zonal mask, the threshold mask can change from block to block because I_t, the set of largest amplitude coefficients, need not be the same for different blocks. The samples retained are quantized by a suitable uniform quantizer followed by an entropy coder.

For the same number of transmitted samples (or quantizing bits), the threshold mask gives a better choice of transmitted samples (that is, lower distortion). However, it also results in an increased rate because the addresses of the transmitted samples, that is, the boundary of the threshold mask, have to be coded for every image block. One method is to run-length code the transition boundaries in the threshold mask line by line. Alternatively, the two-dimensional transform coefficients are mapped into a one-dimensional sequence arranged in a predetermined order, such as in Fig. 11.21c, and the thresholded sequence transitions are then run-length coded. Threshold coding is adaptive in nature and is useful for achieving high compression ratios when the image contents change considerably from block to block, so that a fixed zonal mask would be inefficient.
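A sketch of the threshold-coding bookkeeping (plain Python with NumPy; the zigzag ordering function and the fixed threshold value are illustrative assumptions of mine, since the text leaves the scan order and threshold selection open):

import numpy as np

def zigzag_order(n):
    """Return the (row, col) visiting order of an n x n block along
    antidiagonals, low frequencies first (cf. Fig. 11.21c)."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def threshold_code(V, eta=10.0):
    """Keep only coefficients with magnitude above eta; report the run lengths
    of discarded coefficients along the scan (these runs, plus the retained
    values, are what would be entropy coded)."""
    runs, kept, run = [], [], 0
    for r, c in zigzag_order(V.shape[0]):
        if abs(V[r, c]) > eta:
            runs.append(run)
            kept.append(V[r, c])
            run = 0
        else:
            run += 1
    return runs, kept

V = np.array([[80., 24., 3., 1.],
              [22.,  5., 2., 0.],
              [ 4.,  2., 1., 0.],
              [ 1.,  0., 0., 0.]])
print(threshold_code(V))   # run lengths of skipped coefficients and retained values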
For first-order AR sequences and for certain random fields represented by low-order noncausal models, fast KL transform coding approaches or exceeds the data compression efficiency of block KL transform coders. Recall from Section 6.5 that an N x 1 vector u whose elements u(n), 1 ≤ n ≤ N, come from a first-order, stationary AR sequence with zero mean and correlation ρ has the decomposition

u = u° + u^b    (11.79)

where u^b is completely determined by the boundary variables u(0) and u(N + 1) (see Fig. 6.8), and u° and u^b are mutually orthogonal random vectors. The KL transform of the sequence {u°(n), 1 ≤ n ≤ N} is the sine transform, which is a fast transform. Thus (11.79) expresses the N x 1 segment of a stationary Markov process as a two-source model. The first source has a fast KL transform, and the second source has only two degrees of freedom (that is, it is determined by two variables).

Suppose we are given the N + 2 elements u(n), 0 ≤ n ≤ N + 1. Then the N x 1 sequences u°(n) and u^b(n) are realized as follows. First the boundary variables u(0) and u(N + 1) are passed through an interpolating FIR filter, which gives u^b(n), the best mean square estimate of u(n), 1 ≤ n ≤ N (11.80). Then we obtain the residual sequence

u°(n) ≜ u(n) - u^b(n),    1 ≤ n ≤ N    (11.81)

Instead of transform coding the original (N + 2) x 1 sequence by its KL transform, u° and u^b can be coded separately using three different methods [6, 27]. One of these methods, called recursive block coding, is discussed here.
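A minimal sketch of the two-source decomposition (plain Python with NumPy; following the remark later in this section, the boundary interpolator is approximated by a straight line between the two boundary samples, and the orthonormal sine-transform matrix is built directly):

import numpy as np

def two_source_split(u_ext):
    """Split u(0..N+1) into the boundary response u_b (straight-line
    interpolation between u(0) and u(N+1), a good approximation for rho
    near 1) and the residual u0 = u - u_b, 1 <= n <= N."""
    N = len(u_ext) - 2
    t = np.arange(1, N + 1) / (N + 1.0)
    u_b = (1 - t) * u_ext[0] + t * u_ext[-1]
    return u_ext[1:-1] - u_b, u_b

def sine_transform(x):
    """Orthonormal DST: the fast KL transform of the residual source u0."""
    N = len(x)
    n = np.arange(1, N + 1)
    S = np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * np.outer(n, n) / (N + 1))
    return S @ x      # S is symmetric and orthonormal, so it is its own inverse

rng = np.random.default_rng(0)
u_ext = np.cumsum(rng.normal(size=18))     # a correlated test sequence, N = 16
u0, u_b = two_source_split(u_ext)
v = sine_transform(u0)                     # coefficients to be quantized and coded
print(np.allclose(sine_transform(v), u0))  # True: the DST matrix is self-inverse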
[Figure 11.22: Fast KL transform coding (recursive block coding). (a) Coder: the quantized boundary pixels u'(0) and u'(N + 1) drive the interpolating filter to form u^b, which is subtracted from the block; the residual u° is (sine) transform coded. (b) Decoder: the received boundary values and residual coefficients are recombined to reconstruct the block. Each successive block brings (N + 1) new pixels.]
Remarks

The interpolating FIR filter of (11.80) can be shown to be approximately the simple straight-line interpolator when ρ is close to 1 (see Problem 6.13 and [27]). Hence, u^b can be viewed as a low-resolution copy obtained by subsampling and interpolating the original image. This fact can be utilized in image archival applications, where only the low-resolution image is retrieved in search mode and the residual image is recalled once the desired image has been located. In this way the search process can be speeded up.
[Figure 11.23: Distortion versus rate comparison of the conventional block KL transform coder and the recursive block (fast KLT) coder.]
The foregoing theory can also be extended to second-order and higher AR models [27]. In these cases the KL transform of the residuals u°(n) is no longer a fast transform. This may not be a disadvantage for the recursive block coding algorithm because the transform size can now be quite small, so that a fast transform is not necessary.

In two dimensions many noncausal random field models yield fast KLT decompositions (see Example 6.17). Two-dimensional fast KLT coding algorithms similar to the ones just discussed can be designed using these decompositions. Figure 11.24 shows a two-dimensional recursive block coder. Figure 11.25 compares results of recursive block coding with cosine transform coding. The error images show the reduction in the block effects.
[Figure 11.24: Two-dimensional recursive block coding. The image is coded block by block; the previously coded boundary pixels b1, b2, b3, b4 surrounding the current block are quantized and passed through an interpolating filter to form the boundary response, which is subtracted from the block before transform coding.]

Two-Source Coding
•
~--~~}~
".
1
,
,
, ¥ • - '
,,',--,,','~?
·.i .
J ,~'H% "'_'_"~ '01 , ,; A
!\
•
,
Input signal Synthetic highs Synthesillld edge
lal
Corner point
(bl
Figure Il.U Two-source coding via (a) synthetic heights; (b) detrending.
From Section 3.3 we know that a weighted mean square criterion can be useful for visual evaluation of images. An FFT coder that incorporates this criterion (Fig. 11.27) quantizes the transform coefficients of the image contrast field weighted by H(k, l), the sampled frequency response function of the visual system. Inverse weighting followed by the inverse FFT gives the reconstructed contrast field. To apply this method to block image coding using arbitrary transforms, the image contrast field should first be convolved with h(m, n), the sampled Fourier inverse of H(ξ1, ξ2). The resulting field can then be coded by any desired method. At the receiver, the decoded field must then be convolved with the inverse filter whose frequency response is 1/H(ξ1, ξ2).
[Figure 11.27: Transform coding using visual weighting. The luminance L(m, n) is converted to the contrast field, Fourier transformed, weighted by H(k, l), quantized (MMSE quantizer), inverse weighted by 1/H(k, l), inverse transformed, and converted back from contrast to luminance.]
Adaptation of the transform basis vectors is most expensive because a new set of KL basis vectors is required whenever any change occurs in the statistical parameters. A more practical method is to adapt the bit assignment of an image block, classified into one of several predetermined categories, according to the spatial activity (for instance, the variance of the data) in that block [1(c), p. 1285]. This results in a variable average rate from block to block but gives a better utilization of the total bits over the entire ensemble of image blocks. Another adaptive scheme is to allocate bits to image blocks so that each block has the same distortion [29]. This results in a uniform degradation of the image and appears less objectionable to the eye.

In adaptive quantization schemes, the bit allocation is kept constant, but the quantizer levels are adjusted according to changes in the variances of the transform coefficients. Transform domain variances may be estimated either by updating the statistical parameters of the covariance model or by local averaging of the squared magnitudes of the transform domain samples. Examples of adaptive transform coding are given in Section 11.7, where we consider interframe transform coding.
finite-order causal predictors may never achieve compression ability close to transform coding, because a finite-order causal representation of a two-dimensional random field may not exist. From an implementation point of view, predictive coding has much lower complexity both in terms of memory requirements and the number of operations to be performed. However, with the rapidly decreasing cost of digital hardware and computer memory, the hardware complexity of transform coders will not remain a disadvantage for very long. Table 11.4 summarizes the typical compression achievable:

TABLE 11.4
One-dimensional               2-4
Two-dimensional               4-8
Two-dimensional adaptive      8-16
Three-dimensional             8-16
Three-dimensional adaptive    16-32
[Figure 11.28: Hybrid coding. Each column of the image is transformed; each transform coefficient sequence v_n(k), k = 1, ..., N, is coded by its own DPCM channel (predictor filter plus quantizer), and the outputs are multiplexed onto the channel; the receiver reverses the process.]
Hybrid Coding Algorithm. Let u_n, n = 0, 1, ..., denote the N x 1 columns of an image, which are transformed as

v_n = A u_n,    n = 0, 1, 2, ...    (11.83)

For each k, the sequence v_n(k) is usually modeled by a first-order AR process [32], as

v_n(k) = a(k) v_{n-1}(k) + b(k) e_n(k),
E[e_n(k) e_{n'}(k')] = σe²(k) δ(k - k') δ(n - n')    (11.84)

The parameters of this model can be identified from the covariances of v_n(k), n = 0, 1, ..., for each k (see Section 6.4). Some semicausal representations of images can also be reduced to such models (see Section 6.9). The DPCM equations for the kth channel can now be written as

Predictor:  v̄_n(k) = a(k) v*_{n-1}(k)    (11.85a)

Quantizer input:  e_n(k) = [v_n(k) - v̄_n(k)] / b(k)    (11.85b)

Assuming that all the DPCM channels are in their steady state, the average mean square distortion in the coding of any vector (for noiseless channels) is simply the average of the distortions in the various DPCM channels, that is,

D = (1/N) Σ_{k=1}^{N} σe²(k) g_k(n_k),    g_k(x) ≜ f(x) / [1 - |a(k)|² f(x)]    (11.87)

where f(x) and g_k(x) are the distortion-rate functions of the quantizer and the kth DPCM channel, respectively, for unit variance prediction error (see Problem 11.6). The bit allocation problem for hybrid coding is to minimize (11.87) subject to (11.86). This is now in the framework of the problem defined in Section 11.4, and the algorithms given there can be applied.
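A compact sketch of the hybrid (transform/DPCM) coder of (11.83)-(11.85) (plain Python with NumPy; the column transform is an orthonormal DCT matrix built inline, and the constant AR parameter a and the quantizer step are illustrative constants of mine rather than parameters identified from data):

import numpy as np

def dct_matrix(n):
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def hybrid_code(img, a=0.9, step=4.0):
    """Hybrid coding: transform each column (11.83), then run an independent
    first-order DPCM loop along the rows of coefficients (11.85);
    returns the reconstructed image."""
    N, M = img.shape
    A = dct_matrix(N)
    V = A @ img                       # v_n = A u_n for every column n
    V_rec = np.zeros_like(V)
    prev = np.zeros(N)                # v*_{n-1}(k) for all k
    for n in range(M):
        pred = a * prev               # (11.85a), one predictor per coefficient k
        e = V[:, n] - pred            # (11.85b) with b(k) = 1 for simplicity
        e_q = step * np.round(e / step)
        V_rec[:, n] = pred + e_q
        prev = V_rec[:, n]
    return A.T @ V_rec                # reconstructed image (A is orthonormal)

img = np.add.outer(np.linspace(0, 100, 16), np.linspace(0, 100, 16))
print(round(float(np.mean((img - hybrid_code(img)) ** 2)), 2))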
Example 11.7

Suppose the semicausal model (see Section 6.9 and Eq. (6.106))

u(m, n) = α[u(m - 1, n) + u(m + 1, n)] + γ u(m, n - 1) + ε(m, n),    u(m, 0) = 0, ∀m
E[ε(m, n)] = 0,    E[ε(m, n) ε(i, j)] = β² δ(m - i, n - j)

is used to represent an N x M image with high interpixel correlation. At the boundaries we can assume u(0, n) = u(1, n) and u(N, n) = u(N + 1, n). With these boundary conditions, this model has the realization of (11.84) for the cosine transformed columns of the image, with a(k) = γ/λ(k), b(k) = 1/λ(k), σe²(k) = β², λ(k) ≜ 1 - 2α cos[(k - 1)π/N]. At an average rate of 1 bit per pixel, the bit allocation for the 16 cosine transform coefficients of each column is

3, 3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0

Thus, only the first eight cosine transform coefficients of each 16 x 1 column are used for DPCM coding. Figure 11.29a shows the result of hybrid coding using this model.
[Figure 11.29: Hybrid encoded images at 1 bit/pixel. (a) Nonadaptive; (b) adaptive classification.]
Frame Repetition

Beyond the horizontal and/or vertical line interlace methods discussed in Section 11.1, a simple method of interframe compression is to subsample and frame-repeat interlaced pictures. This, however, does not produce good-quality moving images. An alternative is selective replenishment, where the frames are transmitted at a reduced rate according to a fixed, predetermined updating algorithm. At the receiver, any nonupdated data is refreshed from the previous frame stored in the frame memory. This method is reasonable for slow-moving areas only.
Resolution Exchange

The response of the human visual system is poor for dynamic scenes that simultaneously contain high spatial and temporal frequencies. Thus, rapidly changing areas of a scene can be represented with reduced amplitude and spatial resolution compared with the stationary areas. This allows an exchange of spatial resolution for temporal resolution and can be used to produce good-quality images at data rates of 2 to 2.5 bits per pixel. One such method segments the image into stationary and moving areas by thresholding the value of the frame-difference signal. In stationary areas frame differences are transmitted for every other pixel, and the remaining pixels are repeated from the previous frame. In moving areas 2 : 1 horizontal subsampling is used, with the intervening elements restored by interpolation along the scan lines. Using 5-bits-per-pixel frame-differential coding, a channel rate of 2.5 bits per pixel can be achieved. The main distortion occurs at sharp edges moving with moderate speed.
Conditional Replenishment

This technique is based on detection and coding of the moving areas, which are replenished from frame to frame. Let u(m, n, i) denote the pixel at location (m, n) in frame i. The interframe difference signal is

e(m, n, i) = u(m, n, i) - u'(m, n, i - 1)    (11.88)

where u'(m, n, i - 1) is the reproduced value of u(m, n, i - 1) in the (i - 1)st frame. Whenever the magnitude of e(m, n, i) exceeds a threshold η, it is quantized and coded for transmission. At the receiver, a pixel is reconstructed either by repeating the value at that pixel location from the previous frame if it came from a stationary area, or by adding the decoded difference signal if it came from a moving area, giving

u'(m, n, i) = u'(m, n, i - 1) + e'(m, n, i),    if |e(m, n, i)| > η
u'(m, n, i) = u'(m, n, i - 1),                  otherwise    (11.89)

For transmission, code words representing the quantized values and their addresses are generated. Isolated points or very small clusters of moving areas are ignored to make the address coding scheme efficient. A reasonable-size buffer with an appropriate buffer-control strategy is necessary to achieve a steady bit rate. With insufficient buffer size, its control can require extreme action (such as stopping the coder temporarily), which can cause jerky reproduction of motion (Fig. 11.30a). Simulation studies [6, 39] have shown that with a suitably large buffer a 1-bit-per-pixel rate can be achieved conveniently with an average SNR' of about 34 dB (39 dB in stationary areas and 30 dB in moving areas). Figure 11.30b shows an encoded image and the encoding error magnitudes for a typical frame.
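The replenishment rule (11.88)-(11.89) in code form (plain Python with NumPy; the threshold, the quantizer step, and the synthetic frames are illustrative):

import numpy as np

def conditional_replenishment(prev_rec, curr, eta=8.0, step=4.0):
    """Code frame i given the reproduced frame i-1: only pixels whose frame
    difference exceeds eta are quantized and transmitted (11.89)."""
    e = curr - prev_rec                      # interframe difference (11.88)
    moving = np.abs(e) > eta                 # addresses to be transmitted
    e_q = step * np.round(e / step)          # quantized differences
    rec = np.where(moving, prev_rec + e_q, prev_rec)
    return rec, moving, e_q[moving]          # reproduced frame, addresses, values

rng = np.random.default_rng(2)
frame0 = rng.integers(100, 140, (16, 16)).astype(float)
frame1 = frame0.copy()
frame1[4:8, 4:8] += 30.0                     # a small moving area
rec, moving, values = conditional_replenishment(frame0, frame1)
print(int(moving.sum()), "pixels replenished")   # 16 pixels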
[Figure 11.30: Conditional replenishment coding results: (a) effect of buffer overflow control; (b) a coded frame and the corresponding encoding-error magnitudes.]
[Figure 11.31: Adaptive interframe predictive coder. The raster-scanned input is predicted using intensity and motion (displacement) predictors with frame delays and memories; the prediction error is coded by one of a bank of quantizers Q1, ..., QM, QR and an entropy coder, with a buffer regulating the output rate to the channel.]
In principle, if the motion trajectory of each pixel could be measured, then only the initial frame and the trajectory information would need to be coded. To reproduce the images we could simply propagate each pixel along its trajectory. In practice, the motion of objects in the scene can be approximated by piecewise displacements from frame to frame. The displacement vector is used to direct the prediction, assuming the velocity remains constant during the frame intervals. The velocity vector can be estimated from the interframe data, after it has been segmented into stationary and moving areas [41], by minimizing the function (11.98), where c_αβ denotes the correlation between ∂u/∂α and ∂u/∂β, that is,

c_αβ ≜ ∫∫ (∂u/∂α)(∂u/∂β) dx dy,    α, β = x, y, t    (11.100)

This calculation can be speeded up by estimating the correlations as suggested in [40]. In the resulting predictor (or interpolator), (p, q) and (p', q') are the displacement vectors relative to the preceding and following frames, respectively. Without motion compensation, we would set p = q = p' = q' = 0. Figure 11.32 shows the advantage of motion compensation in frame skipping. The improvement due to motion compensation, roughly 10 dB, is quite significant.
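A small block-matching sketch of displacement estimation and motion-compensated prediction (plain Python with NumPy; the exhaustive search over a ±4-pixel range and the mean-square matching criterion are my illustrative choices, not the estimator of [40] or [41]):

import numpy as np

def estimate_displacement(prev, curr, top, left, block=8, rng=4):
    """Find (p, q) minimizing the mean square difference between the current
    block and a displaced block in the preceding frame."""
    target = curr[top:top + block, left:left + block]
    best, best_pq = np.inf, (0, 0)
    for p in range(-rng, rng + 1):
        for q in range(-rng, rng + 1):
            r, c = top + p, left + q
            if 0 <= r and r + block <= prev.shape[0] and 0 <= c and c + block <= prev.shape[1]:
                err = float(np.mean((target - prev[r:r + block, c:c + block]) ** 2))
                if err < best:
                    best, best_pq = err, (p, q)
    return best_pq, best

# A bright square shifted by (2, 3) pixels between frames.
prev = np.zeros((32, 32)); prev[10:18, 10:18] = 200.0
curr = np.zeros((32, 32)); curr[12:20, 13:21] = 200.0
print(estimate_displacement(prev, curr, top=12, left=13))   # ((-2, -3), 0.0)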
[Figure 11.32: Motion compensation in frame skipping: (a) frame repetition (or interframe prediction) based on the preceding frame; (b) frame interpolation from the preceding and the following frames.]
I DCr'
...
•
+ •
+ •
ui _ ! (m - p, n - q) Frame
" memory
•
•
Trans Jetton
of bl ock
Motion Transmit
Frame extimator Displacement vector 0', q)
memory Figure 11.33 ' Interfrarne hybrid coding
with motion compensation.
The interframe hybrid coder with motion compensation is shown in Fig. 11.33. Results of
different interframe hybrid coding methods are shown in Fig. 11.34. These and other
results [6, 36] show that with motion compensation, the adaptive hybrid coding method
performs better than adaptive predictive coding and adaptive three-dimensional
transform coding. However, the coder now requires two sets of two-dimensional
transformations.
Figure 11.35 Interframe transform coding: 0.6 bit/pixel with the separable covariance
model (SNR' = 32.1 dB); 0.5 bit/pixel with measured covariances, adaptive (SNR' =
41.2 dB); and 0.125 bit/pixel, adaptive with motion compensation (SNR' = 36.7 dB).
which, as expected, performs poorly. Also, the adaptive hybrid coding with motion
compensation performs better than three-dimensional transform coding. This is
because incorporating motion information in a three-dimensional transform coder
requires selecting spatial blocks along the motion trajectory, which is not a very
attractive alternative.
So far we have assumed the channel between the coder and the decoder to be
noiseless. To account for channel errors, we have to add redundancy to the input
by appending error-correcting bits. Thus a proper trade-off between source coding
(redundancy removal) and channel coding (redundancy injection) has to be
achieved in the design of data compression systems. Often, the error-correcting
codes are designed to reduce the probability of bit errors, and for simplicity, equal
protection is provided to all the samples. For image data compression algorithms,
this does not minimize the overall error. In this section we consider source-channel-
encoding methods that minimize the overall mean square error.
Consider the PCM transmission system of Fig. 11.36, where a quantizer gener-
ates k-bit outputs x ∈ S, which are mapped, one-to-one, into n-bit (n ≥ k)
code words g ∈ C. Let β(·) denote this mapping. The channel is assumed to be
memoryless and binary symmetric with bit error probability p_e. It maps the set C of
K = 2^k possible n-bit code words into a set V of 2^n possible n-bit words. At the
receiver, λ(·) denotes the mapping of elements of V into elements on the real line R.
The identity element of V is the vector 0 ≜ [0, 0, ..., 0].
The mean square error between the decoder output and the encoder input
depends on the mappings β(·) and λ(·). From estimation theory (see Section
2.12) we know that given the encoding rule β, the decoder that minimizes this error
is given by the conditional mean of x, that is,

    y = λ(v) = Σ_{x ∈ S} x p(x|v) = E[x|v]                                   (11.107)
where p(x|v) is the conditional density of x given the channel output v. The function
λ(v) need not map the channel output into the set S even if n = k.
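For a binary symmetric channel the conditional mean of (11.107) can be evaluated
directly from the channel transition probabilities. The sketch below assumes the code
assignment β, the quantizer levels, and their prior probabilities are supplied by the
caller; the interface is illustrative, not fixed by the text.

import numpy as np

def optimum_decoder(v, codewords, levels, prior, p):
    """Sketch of the mean-square-optimal decoder (11.107) for a binary
    symmetric channel with bit error probability p.

    codewords[x] is the n-bit sequence g = beta(x) assigned to the quantizer
    output level levels[x], which occurs with prior probability prior[x].
    Returns y = E[x | v]."""
    v = np.asarray(v)
    num = den = 0.0
    for x, g in enumerate(codewords):
        d = int(np.sum(v != np.asarray(g)))            # Hamming distance between v and g
        lik = (p ** d) * ((1.0 - p) ** (len(g) - d))   # p(v | g) for the BSC
        w = prior[x] * lik                             # proportional to p(x | v)
        num += levels[x] * w
        den += w
    return num / den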
TABLE 11.6 Basis Vectors {φ_i, i = 1, ..., k} for (n, k) Group Codes
[Binary basis vectors φ_i for k = 1, ..., 8 and n - k = 0, ..., 4, with an example
assignment for n = 11, k = 6.]
where ⊕ denotes exclusive-OR summation and · denotes the binary product. The
codes generated by this method are called the (n, k) group codes.
Example 11.8

Let n = 4 and k = 2, so that n - k = 2. Then φ_1 = [1 0 1 1], φ_2 = [0 1 0 1],
and β(·) is given as follows.

    x    b      g = β(b)
    0    0 0    0 0 0 0 = 0·φ_1 ⊕ 0·φ_2
    1    0 1    0 1 0 1 = 0·φ_1 ⊕ 1·φ_2
    2    1 0    1 0 1 1 = 1·φ_1 ⊕ 0·φ_2
    3    1 1    1 1 1 0 = φ_1 ⊕ φ_2
In general the basis vectors φ_i depend on the bit error probability p_e and the source
probability distribution. For other distributions, Table 11.6 is found to lower the
channel coding performance only slightly for p_e ≪ 1 [39]. Therefore, these group codes
are recommended for all mean square channel coding applications.
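The group-code mapping of Example 11.8 amounts to exclusive-OR'ing the basis vectors
selected by the bits of the quantizer output. A minimal sketch (with Example 11.8 as a
check) follows.

def group_encode(x, phis):
    """Sketch of the (n, k) group-code mapping: the k bits of the quantizer
    output x select which basis vectors phi_i are exclusive-OR'ed together.
    phis is a list of k binary vectors, e.g. [[1,0,1,1], [0,1,0,1]] for the
    n = 4, k = 2 case of Example 11.8."""
    k, n = len(phis), len(phis[0])
    bits = [(x >> (k - 1 - i)) & 1 for i in range(k)]   # MSB of x selects phi_1
    g = [0] * n
    for b, phi in zip(bits, phis):
        if b:
            g = [gi ^ pi for gi, pi in zip(g, phi)]
    return g

phis = [[1, 0, 1, 1], [0, 1, 0, 1]]
print([group_encode(x, phis) for x in range(4)])
# [[0,0,0,0], [0,1,0,1], [1,0,1,1], [1,1,1,0]], matching Example 11.8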
•
If η_c and η_q denote the channel and the quantizer errors, we can write (from Fig.
11.36) the input sample as

    z = x + η_q = y + η_c + η_q                                              (11.109)

This gives the total mean square error as

    σ_d² = E[(z - y)²] = E[(η_c + η_q)²]                                     (11.110)

For a fixed channel coder β(·), this error is minimum when [6, 39] (1) η_c and η_q are
orthogonal and (2) σ_c² ≜ E[η_c²] and σ_q² ≜ E[η_q²] are minimum. This requires

    y = λ(v) = E[x|v]                                                        (11.111)
    x = α(z) = E[z | z ∈ Z_i]                                                (11.112)

where Z_i, i = 1, ..., 2^k, denotes the ith quantization interval of the quantizer.
This result says that the optimum decoder is independent of the optimum
quantizer, which is the Lloyd-Max quantizer. Thus the overall optimal design can be
accomplished by optimizing the quantizer and the decoder individually. This gives

    σ_d² = σ_c² + σ_q²                                                       (11.113)

Let f(k) and c(n, k) denote the mean square distortions due to the k-bit
Lloyd-Max quantizer and the channel, respectively, when the quantizer input is a
unit variance random variable (Tables 11.7 and 11.8). Then the total distortion d(n)
at an overall rate of n bits per sample can be expressed in terms of f(k) and c(n, k).
TABLE 11.7 Quantizer Distortion f(k) for Gaussian, Laplacian, and Uniform Densities
(k = 1, ..., 8; unity variance input)

TABLE 11.8 Channel Distortion c(n, k) for (n, k) Block Coding of Outputs of a Quantizer
with Unity Variance Input
where η_q is now the DPCM quantizer noise and h(i, j) is the impulse response of the
reconstruction filter.
loop. Recall that high compression is achieved for small values of β². Equation
(11.117) shows that the higher the compression, the larger is the channel error
amplification. Visually, channel noise in DPCM tends to create two-dimensional
patterns that originate at the channel error locations and propagate until the
reconstruction filter impulse response decays to zero (see Fig. 11.38). In line-by-line
DPCM, streaks of erroneous lines appear. In such cases, the erroneous line can be
replaced by the previous line or by an average of neighboring lines. A median filter
operating orthogonally to the scanning direction can also be effective.
To minimize channel error effects, the distortion given by (11.117) must be minimized
to find the optimum quantizer bit allocation k = k(n) for a given overall rate of n bits
per pixel.
Example 11.9

A predictor with a_1 = 0.848, a_2 = 0.755, a_3 = -0.608, a_4 = 0 in (11.35a) and β² = 0.019
is used for DPCM of images. Assuming a Gaussian distribution for the quantizer input,
the optimum pairs [n, k(n)] are found to be:

TABLE 11.9 Optimum Pairs [n, k(n)] for DPCM Transmission

             p_e = 0.01           p_e = 0.001
    n        1  2  3  4  5        1  2  3  4  5
    k(n)     1  1  1  1  2        1  2  2  2  3
This shows that if the error rate is high (p_e = 0.01), it is better to protect against
channel errors than to worry about the quantizer errors. To obtain an optimum pair,
we evaluate σ_d²/σ_u² via (11.117) and Tables 11.6, 11.7, and 11.8 for different values of k
(for each n). Then the value of k for which this quantity is minimum is found. For a
given choice of (n, k), the basis vectors from Table 11.6 can be used to generate the
transmission code words.
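The search described in Example 11.9 has a simple structure: for each overall rate n,
evaluate the total distortion for every admissible k and keep the minimizer. The sketch
below leaves the distortion evaluation (via (11.117) and Tables 11.6 to 11.8) to a
caller-supplied function, since those tables are not reproduced in full here.

def optimum_pair(n, total_distortion):
    """Sketch of the optimum-pair search of Example 11.9.  total_distortion(n, k)
    stands for the evaluation of sigma_d^2/sigma_u^2 via (11.117) and Tables
    11.6-11.8; its implementation is assumed to be supplied by the caller.
    Returns the minimizing k(n) and the resulting distortion."""
    candidates = [(total_distortion(n, k), k) for k in range(1, n + 1)]
    d, k = min(candidates)
    return k, d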
Suppose a channel error causes a distortion δv(k, l) of the (k, l)th transform coeffi-
cient. This error manifests itself by spreading in the reconstructed image in propor-
tion to the (k, l)th basis image, as

                                                                             (11.118)
Figure 11.38 Two bits/pixel DPCM coding in the presence of transmission errors.
(a) Propagation of transmission errors for different predictors (clockwise from top left:
error location, optimum, three-point, and two-point predictors). (b) Optimum linear
predictor. (c) Two-point predictor. (d) Three-point predictor (A + C - B).
This is actually an advantage of transform coding over DPCM because, for the same
mean square value, localized errors tend to be more objectionable than distributed
errors. The foregoing results can be applied for designing transform coders that
protect against channel errors. A transform coder contains several PCM channels,
each operating on one transform coefficient. If we represent z_j as the jth transform
coefficient with variance σ_j², then the average mean square distortion of a transform
coding scheme in the presence of channel errors becomes

    D = Σ_j σ_j² d(n_j)                                                      (11.119)

where n_j is the number of bits allocated to the jth PCM channel. The bit allocation
algorithm for a transform coder will now use the function d(n), which can be
evaluated via (11.115).
Figure 11.40 Transform coding in the presence of channel errors: (c) 1 bit/pixel,
p_e = 10⁻², without channel error protection; (d) 1 bit/pixel, p_e = 10⁻², with channel
error protection.
Knowing n_j, we can find k_j = k(n_j), the corresponding optimum number of
quantizer bits.
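One way to carry out the bit allocation of (11.119) with integer n_j is a greedy
marginal-return rule: repeatedly give one more bit to the coefficient whose weighted
distortion σ_j² d(n_j) drops the most. This particular rule is an assumption made for
illustration; the text only states that the allocation algorithm uses d(n).

def allocate_bits(variances, d, total_bits, n_max=8):
    """Greedy integer bit allocation sketch for (11.119).  d(n) is the
    per-channel distortion for a unit-variance input; one bit at a time is
    given to the coefficient whose weighted distortion decreases the most."""
    alloc = [0] * len(variances)
    for _ in range(total_bits):
        gains = [(variances[j] * (d(alloc[j]) - d(alloc[j] + 1)), j)
                 for j in range(len(variances)) if alloc[j] < n_max]
        if not gains:
            break
        _, j = max(gains)
        alloc[j] += 1
    return alloc   # n_j for each coefficient; k_j = k(n_j) then follows from d's definition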
Figure 11.39 shows the bit allocation pattern k_o(i, j) for the quantizers and the
allocation of channel protection bits (n(i, j) - k_o(i, j)) at an overall average bit rate
of 1 bit per pixel for 16 x 16 block coding of images modeled by the isotropic
covariance function. As expected, more protection is provided to samples that have
larger variances (and are, therefore, more important for transmission). The over-
head due to channel protection, even for the large value of p_e = 0.01, is only 15%.
For p_e = 0.001, the overhead is about 4%. Figure 11.40 shows the results of the
preceding technique applied for transform coding of an image in the presence of
channel errors. The improvement in SNR is 10 dB at p_e = 0.01 and is also significant
visually. This scheme has been found to be quite robust with respect to fluctuations
in the channel error rates [6, 39].
11.9 CODING OF TWO-TONE IMAGES
The need for electronic storage and transmission of graphics and two-tone images
such as line drawings, letters, newsprint, maps, and other documents has been
increasing rapidly, especially with the advent of personal computers and modern
telecommunications. Commercial products for document transmission over tele-
phone lines and data lines already exist. The CCITT has recommended a set of
eight documents (Fig. 11.41) for comparison and evaluation of different binary
image coding algorithms. The CCITT standard sampling rates for typical A4 (8½-in.
by 11-in.) documents for transmission over the so-called Group 3 digital facsimile
apparatus are 3.85 lines per millimeter at normal resolution and 7.7 lines per
millimeter at high resolution in the vertical direction. The horizontal sampling rate
standard is 1728 pixels per line, which corresponds to 7.7 lines per millimeter
resolution or 200 points per inch (ppi). For newspaper pages and other documents
that contain text as well as halftone images, sampling rates of 400 to 1000 ppi are
used. Thus, for the standard 8½-in. by 11-in. page, 1.87 x 10^6 bits will be required at
200 ppi x 100 lpi sampling density. Transmitting this information over a 4800-bit/s
telephone line will take over 6 min. Compression by a factor of, say, 5 can reduce
the transmission time to about 1.3 minutes.
Many compression algorithms for binary images exploit the facts that (1) most
pixels are white and (2) the black pixels occur with a regularity that manifests itself
in the form of characters, symbols, or connected boundaries. There are three basic
concepts of coding such images: (1) coding only transition points between black
and white, (2) skipping white, and (3) pattern recognition. Figure 11.42 shows a
convenient classification of algorithms based on these concepts.
•
Run-length Coding
In run-length coding (RLC) the lengths of black and white runs on the scan lines are
coded. Since white (1s) and black (0s) runs alternate, the color of the run need not
be coded (Fig. 11.43).
Figure 11.41 The eight CCITT test documents.

Figure 11.42 Classification of binary image coding algorithms: white block skipping,
run-length coding, relative addressing techniques, predictive coding, and other methods.
The first run is always a white run, with length zero if necessary. The run lengths can
be coded by fixed-length m-bit code words, each representing a block of maximum run
length M - 1, where M = 2^m.
TASlE 11.10 Modified Huffman Code Tables for One-dimensional Run-length Coding
A run length l is written as l = q(2^N - 1) + r, where q is a nonnegative integer. The
code word for l has (q + 1)N bits, of which the first qN bits are 0 and the last N bits
are the binary representation of r. For example, if N = 2, the A_2 code for l = 8
(8 = 2 x 3 + 2) is 000010. For a geometric distribution with mean μ, the optimum N
is the integer nearest to (1 + log_2 μ).
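A sketch of this code word construction is given below; the decomposition
l = q(2^N - 1) + r with 1 ≤ r ≤ 2^N - 1 is inferred from the A_2 example above, so
treat the exact range of r as an assumption.

def a_code(l, N=2):
    """Sketch of the run-length code word described above: q all-zero N-bit
    blocks followed by the N-bit binary value of r, where l = q*(2**N - 1) + r
    and r is kept in [1, 2**N - 1] (assumed range)."""
    base = (1 << N) - 1
    q, r = divmod(l, base)
    if r == 0:
        q, r = q - 1, base
    return "0" * (q * N) + format(r, "0{}b".format(N))

print(a_code(8, N=2))   # '000010', matching the A_2 example in the text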
Experimental evidence shows that long run lengths are more common than
predicted by an exponential distribution. A better model for the run-length distribution
is of the form

    P(l) = C / l^α,    α > 0,  C = constant                                  (11.120)
Example 11.10

Table 11.12 lists the averages μ_w and μ_b and the entropies H_w and H_b of white and black
run lengths for the CCITT documents. From this data, an upper bound on the achievable
compression can be obtained as

    C_max = (μ_w + μ_b) / (H_w + H_b)                                        (11.121)

which is also listed in the table. These results show that compression factors of 5 to 20
are achievable by RLC techniques.

    R = [(1 - P_N)(N + 1) + P_N] / N = (1 - P_N + 1/N) bits/pixel            (11.122)
TABLE 11.12 Run-Length Averages, Entropies, and C_max for the CCITT Documents

    Document
    number     μ_w      μ_b     H_w     H_b     C_max
    1         156.3     6.8     5.5     3.6     18.0
    2         257.1    14.3     8.2     4.5     21.4
    3          89.8     8.5     5.7     3.6     10.6
    4          39.0     5.7     4.7     3.1      5.7
    5          79.2     7.0     5.7     3.3      9.5
    6         138.5     8.0     6.2     3.6     14.9
    7          45.3     4.4     5.9     3.1      5.6
    8          85.7    70.9     6.9     5.8     12.4
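The bound (11.121) is easy to estimate from an image itself. The sketch below collects
the white and black run lengths of a binary image (1 = white, 0 = black), estimates the
run-length entropies from histograms, and returns C_max; using the histogram entropy as
a stand-in for H_w and H_b is a simplifying assumption.

import numpy as np

def run_length_bound(rows):
    """Estimate C_max of (11.121) from the rows of a binary image."""
    runs = {0: [], 1: []}
    for row in rows:
        row = np.asarray(row)
        change = np.flatnonzero(np.diff(row)) + 1
        starts = np.concatenate(([0], change))
        ends = np.concatenate((change, [len(row)]))
        for s, e in zip(starts, ends):
            runs[int(row[s])].append(e - s)        # run length of the color at position s

    def entropy(lengths):
        vals, counts = np.unique(lengths, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    mu_w, mu_b = np.mean(runs[1]), np.mean(runs[0])
    H_w, H_b = entropy(runs[1]), entropy(runs[0])
    return (mu_w + mu_b) / (H_w + H_b)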
•
Relative address coding (RAC) uses the same principle as the PDQ method and
computes run-length differences by tracking either the last transition on the same
line or the nearest transition on the previous line. For example, the transition pixel
Q (Fig. 11.45) is encoded by the shortest distance PQ or QQ', where P is the
preceding transition element on the current line and Q' is the nearest transition
element to the right of P on the previous line whose direction of transition is the
same as that of Q. If P does not exist, then it is considered to be the imaginary pixel
to the right of the last pixel on the preceding line. The distance QQ' is coded as +N
•
Figure 11.45 RAC method: PQ = 1, QQ' = -1, RAC distance = -1.

Figure 11.44 The PDQ method.
TABLE 11.13 Relative Address Codes. x...x = binary representation of N.

    Distance        Code              N            F(N)
    +0              0                 1-4          0xx
    +1              100               5-20         10xxxx
    -1              101               21-84        110xxxxxx
    N (N > 1)       111 F(N)          85-340       1110xxxxxxxx
    +N (N > 2)      1100 F(N)         341-1364     11110xxxxxxxxxx
    -N (N > 2)      1101 F(N)         1365-5460    111110xxxxxxxxxxxx
if Q' is N (≥0) pixels to the left, or as -N if Q' is N (≥1) pixels to the right of Q on the
preceding line. Distance PQ is coded as N (≥1) if it is N pixels away. The RAC
distances are coded by a code similar to the B_1 code, except for the choice of the
reference line and for very short distances, +0, +1, -1 (see Table 11.13).
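The distance rule can be sketched as follows; the preference for QQ' when the two
distances tie is inferred from the example of Fig. 11.45 (PQ = 1, QQ' = -1, coded
distance -1), and the handling of a missing P is simplified.

def rac_distance(q, p, prev_transitions):
    """Sketch of the RAC distance rule.  q: column of the transition being
    coded; p: column of the preceding transition on the current line (or
    None); prev_transitions: columns of same-direction transitions on the
    previous line.  Returns the kind and signed value of the shorter of
    QQ' and PQ."""
    candidates = []
    right_of_p = [t for t in prev_transitions if p is None or t > p]
    if right_of_p:
        q_prime = right_of_p[0]                   # nearest eligible transition Q'
        candidates.append(("QQ'", q - q_prime))   # +N: Q' left of Q, -N: Q' right of Q
    if p is not None:
        candidates.append(("PQ", q - p))          # PQ coded as N >= 1
    return min(candidates, key=lambda c: abs(c[1]))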
The modified relative element address designate (READ) algorithm has been rec-
ommended by the CCITT for two-dimensional coding of documents. It is a modifica-
tion of the RAC and other similar codes [1d, p. 854]. Referring to Fig. 11.46 we
define a_0 as the reference transition element whose position is defined by the
previous coding mode (to be discussed shortly). Initially, a_0 is taken to be the
imaginary white transition pixel situated to the left of the first pixel on the coding line.
Figure 11.46 Transition elements a_0, a_1, a_2 on the coding line and b_1, b_2 on the
reference line, as used in the pass, vertical, and horizontal modes of the modified READ
algorithm.
Pass mode. b_1 is to the left of a_1 (Fig. 11.46a). This identifies the white or
black runs on the reference line that do not overlap with the corresponding white or
black runs on the coding line. The reference element a_0 is set below b_2 in
preparation for the next coding.

Horizontal mode. If |a_1 b_1| > 3, the vertical mode is not used and the run
lengths a_0 a_1 and a_1 a_2 are coded using the modified Huffman codes of Table 11.10.
After coding, the new position of a_0 is set at a_2. If this mode is needed for the first
element on the coding line, then the value a_0 a_1 - 1 rather than a_0 a_1 is coded. Thus if
•
TABLE 11.14 CCITT Modified READ Code Table [1d, p. 865]

    Mode          Elements to be coded              Notation    Code word
    Pass          b_1, b_2                          P           0001
    Horizontal    a_0 a_1, a_1 a_2                  H           001 + M(a_0 a_1) + M(a_1 a_2)
    Vertical:
      a_1 just under b_1        a_1 b_1 = 0         V(0)        1
      a_1 to the right of b_1   a_1 b_1 = 1         V_R(1)      011
                                a_1 b_1 = 2         V_R(2)      000011
                                a_1 b_1 = 3         V_R(3)      0000011
      a_1 to the left of b_1    a_1 b_1 = 1         V_L(1)      010
                                a_1 b_1 = 2         V_L(2)      000010
                                a_1 b_1 = 3         V_L(3)      0000010
    2-D extensions                                              0000001xxx
    1-D extensions                                              000000001xxx
    End-of-line (EOL) code word                                 000000000001
    1-D coding of next line                                     EOL + 1
    2-D coding of next line                                     EOL + 0

M(a_0 a_1) and M(a_1 a_2) are code words taken from the modified Huffman code tables given in
Table 11.10. The bit assignment for the xxx bits is 111 for the uncompressed mode.
Figure 11.47 CCITT modified READ coding algorithm (flow chart).
" The coding procedure along a line continues until the imaginary transition
element to the right of the last actual element on the line has been detected. In
this way exactly 1728 pixels are coded on each line. Figure 11.47 shows the flow
diagram for the algorithm. Here K is called the K -factor, which means that after a
one-dimensionally coded line, no more than K - 1 successive" lines are two-
dimensionally coded. CCfIT recommended values for K are 2 and 4 for documents
scanned at normal resolution and high resolution, respectively. The K -factor is used
to minimize the effect of channel noise on decoded images. The one-dimensional
and two-dimensional extension code words listed in Table 11.14, with xxx equal to
111, are used to allow the coder to enter the uncompressed mode, which may be
desired when the run lengths are very small or random, such as in areas of halftone
images or cross hatchings present in some business forms. "
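Stripped of the code tables and the detection of transition elements, the mode decision
of Fig. 11.47 reduces to a few comparisons. The following sketch shows only that
decision; detecting a_1, b_1, b_2 and emitting the Table 11.14 code words are assumed
to be handled elsewhere.

def read_mode(a1, b1, b2):
    """Sketch of the modified READ mode decision.  a1 is the next transition
    on the coding line; b1, b2 are the relevant transitions on the reference
    line (all column indices)."""
    if b2 < a1:
        return "pass"          # reference run ends before a1: pass mode
    if abs(a1 - b1) <= 3:
        return "vertical"      # code a1 relative to b1 with V(0), V_R(n), V_L(n)
    return "horizontal"        # code run lengths a0a1 and a1a2 via Table 11.10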
Predictive Coding
The principles of predictive coding can be easily applied to binary images. The main
difference is that the prediction error is also a binary variable, so that a quantizer is
not needed. If the original data has redundancy, then the prediction error sequence
will have large runs of 0s (or 1s). For a binary image u(m, n), let û(m, n) denote its
predicted value based on the values of pixels in a prediction window W, which
contains some of the previously coded pixels. The prediction error is defined as

    e(m, n) = 1,  if u(m, n) ≠ û(m, n)
            = 0,  if u(m, n) = û(m, n)                                       (11.123)
            = u(m, n) ⊕ û(m, n)

The sequence e(m, n) can be coded by a run-length or entropy coding method. The
image is reconstructed from e(m, n) simply as

    u(m, n) = û(m, n) ⊕ e(m, n)                                              (11.124)

Note that this is an errorless predictive coding method. An example of a prediction
window W for a raster scanned image is shown in Fig. 11.48.
    A reasonable prediction criterion is to minimize the prediction error proba-
bility. For an N-element prediction window, there are 2^N different states. Let S_k,
k = 1, 2, ..., 2^N, denote the kth state of W with probability p_k and define

    q_k = Prob[u(m, n) = 1 | S_k]                                            (11.125)

Then the optimum prediction rule having minimum prediction error probability is

    û(m, n) = 1,  if q_k ≥ 0.5
            = 0,  if q_k < 0.5                                               (11.126)
If the random sequence u(m, n) is strict-sense stationary, then the various
probabilities will remain constant at every (m, n), and therefore the prediction rule
stays the same. In practice a suitable choice of N has to be made to achieve a
trade-off between prediction error probability and the complexity of the predictor.
(Illustration: prediction errors and the run lengths associated with states S_0 and S_1
along a scan line.)
If the random sequence u(m, n) is Markovian with respect to the prediction window
W, then the run lengths for each state S_k are independent. Hence, the prediction-
error run lengths for each state can be coded by the truncated Huffman
code, for example. This method has been called the Technical University of
Hannover (TUH) code [1c, p. 1425].
Adaptive Predictors
Adaptive predictors are useful in practice because the image data is generally
nonstationary. In general, any pattern classifier or discriminant function could be
used as a predictor. A simple classifier is a linear learning machine or adaptive
threshold logic unit (TLU), which calculates the threshold q_k as a linear functional of
the states of the pixels in the prediction window. Another type of pattern classifier is
a network of TLUs called a layered machine, which includes piecewise linear discrimi-
nant functions and the so-called α-perceptron. A practical adaptive predictor uses a
counter C_k of L bits for each state [43]. The counter runs from 0 to 2^L - 1. The
adaptive prediction rule is

    û(m, n) = 1,  if C_k ≥ 2^(L-1)
            = 0,  if C_k < 2^(L-1)                                           (11.128)

After prediction of a pixel has been performed, the counter is updated as

    C_k = min(C_k + 1, 2^L - 1),  if u(m, n) = 1
        = max(C_k - 1, 0),        otherwise                                  (11.129)

The value L = 3 has been found to yield minimum prediction error for a typical
printed page.
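A sketch of the counter-based adaptive predictor of (11.128) and (11.129), driven by a
precomputed sequence of window states, is given below; how the window state index is
formed from the pixels of W is left to the caller and is therefore an assumption of the
interface.

def adaptive_binary_predictor(pixels, states, L=3):
    """Sketch of the adaptive predictor of (11.128)-(11.129).  pixels is the
    sequence of binary values u(m, n) in scan order and states the
    corresponding window-state indices S_k.  Returns the prediction-error
    sequence e = u XOR u_hat of (11.123)."""
    counters = {}                        # one L-bit counter per state, starting at 0
    errors = []
    half, full = 1 << (L - 1), (1 << L) - 1
    for u, k in zip(pixels, states):
        c = counters.get(k, 0)
        u_hat = 1 if c >= half else 0                                  # (11.128)
        errors.append(u ^ u_hat)                                       # (11.123)
        counters[k] = min(c + 1, full) if u == 1 else max(c - 1, 0)    # (11.129)
    return errors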
Comparison of Algorithms
Table 11.15 shows a comparison of the compression ratios achievable by different
algorithms. The compression ratios for one-dimensional codes are independent of
the vertical resolution. At normal resolution the two-dimensional codes improve
the compression by only 10 to 30% over the modified Huffman code. At high
resolution the improvements are 40 to 60% and are significant enough to warrant
the use of these algorithms. Among the two-dimensional codes, the TUH predictive
code is superior to the relative address techniques, especially for text information.
However, the latter are simpler to code. The CCITT READ code, which is a
modification of the RAC, performs somewhat better.
Other Methods
Algorithms that utilize higher-level information, such as whether the image con-
tains a known type (or font) of characters or graphics, line drawings, and the like,
can be designed to obtain very high compression ratios. For example, in the case of
printed text limited to the 128 ASCII characters, each character can
be coded by 7 bits. The coding technique would require a character recognition
algorithm. Likewise, line drawings can be efficiently coded by boundary-following
algorithms, such as chain codes, line segments, or splines. Algorithms discussed
here are not directly useful for halftone images because the image area has been
modulated by pseudorandom noise and thresholded thereafter. In all these cases
special preprocessing and segmentation is required to code the data efficiently.
•
11.10 COLOR AND MULTISPECTRAL IMAGE CODING
Data compression techniques discussed so far can be generalized to color and
multispectral images, as shown in Fig. 11.49. Each pixel is represented by a p x 1
vector. For example, in the case of color, the input is a 3 x 1 vector containing the
R, G, B components. This vector is transformed to another coordinate system,
where each component can be processed by an independent spatial coder.

Figure 11.49 Component coding of color images. For multispectral images the
input vector has a dimension greater than or equal to 2.
In coding color images, consideration should be given to the facts that (1)
the luminance component (Y) has higher bandwidth than the chrominance compo-
nents (I, Q) or (U, V) and (2) the color-difference metric is non-Euclidean in these
coordinates, that is, equal noise power in different color components is perceived
differently. In practical image coding schemes, the lower-bandwidth chrominance
signals are sampled at correspondingly lower rates. Typically, the I and Q signals are
sampled at one-third and one-sixth of the sampling rate of the luminance signal. Use
of color-distance metric(s) is possible but has not been used in practical systems,
primarily because of the complexity of the color vision model (see Chapter 3).
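A sketch of the front end of this component coding scheme is given below. The
RGB-to-YIQ matrix is the standard NTSC one (it is not quoted in the text), and plain
decimation is used as a stand-in for whatever filtering and subsampling a real coder
would apply.

import numpy as np

# Standard NTSC RGB -> YIQ conversion matrix (usual coefficients; an assumption here).
RGB_TO_YIQ = np.array([[0.299,  0.587,  0.114],
                       [0.596, -0.274, -0.322],
                       [0.211, -0.523,  0.312]])

def component_coding_front_end(rgb, i_step=3, q_step=6):
    """Convert each pixel to Y, I, Q and subsample the chrominance components
    by decimation along the rows (1/3 for I, 1/6 for Q, per the text).  Each
    output would then feed an independent spatial coder (Fig. 11.49)."""
    yiq = rgb.astype(float) @ RGB_TO_YIQ.T   # rgb has shape (rows, cols, 3)
    y = yiq[:, :, 0]
    i = yiq[:, ::i_step, 1]
    q = yiq[:, ::q_step, 2]
    return y, i, q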
An alternate method of coding color images is by processing the composite
color signal. This is useful in broadcast applications, where it is desired to manage
only one signal. However, since the luminance and color signals are not in the same
frequency band, the foregoing monochrome image coding techniques are not very
efficient if applied directly. Typically, the composite signal is sampled at 3f_sc (the
lowest integer multiple of the subcarrier frequency above the Nyquist rate) or 4f_sc,
and coded by DPCM using predictors such as those listed in Table 11.16.
---"")1A., 0 0 )I
C. 8. A.
.
Previous
III
0 C
I> ,iii
8
1/11
A
•
•
------------~
field
'"'----.-.- .....
o. C. 8. 4.
- .. ---
•
8 3 A 3 0 3 C3 8:i A; ,
•
TABLE 11.16 Predictors for DPCM of the Composite NTSC Signal. z^-1 = 1 pixel delay,
z^-N = 1 line delay, z^-262N = 1 field delay, ρ (leak) ≤ 1.
[One-, two-, and three-dimensional predictors P(z) for each sampling rate.]
    Method                     Components coded   Description                 Rate per component       Average rate
                                                                              (bits/component/pixel)   (bits/pixel)
    PCM                        R, G, B            Raw data                    8                        24
                               U*, V*, W*         Color space quantizer,                               10
                                                  1024 color cells
                               Y, I, Q            I, Q subsampled             8                        12
    DPCM                       Y, I, Q            One-step predictor,         2 to 3                   3 to 4.5
                                                  I, Q subsampled
    Transform (cosine, slant)  Y, I, Q            No subsampling              Y (1.15 to 2),           2.5 to 3
                                                                              I, Q (0.75 to 1)
                               Y, I, Q or         Same as above with          Variable                 1 to 2
                               Y, U, V            adaptive classification
11.11 SUMMARY
PROBLEMS
11.1* For an 8-bit integer image of your choice, determine the Nth-order prediction error
      field ε_N(m, n) ≜ u(m, n) - û_N(m, n), where û_N(m, n) is the best mean square causal
      predictor based on the N nearest neighbors of u(m, n). Truncate û_N(m, n) to the
      nearest integer and calculate the entropy of ε_N(m, n) from its histogram for N =
      0, 1, 2, 3, 4, 5. Using these as estimates for the Nth-order entropies, calculate the
      achievable compression.
11.2 The output of a binary source is to be coded in blocks of M samples. If the successive
outputs are independent and identically distributed with p = 0.95 (for a 0), find the
Huffman codes for M = 1,2,3,4 and calculate their efficiencies.
11.3  For the AR sequence of (11.25), the predictor for feedforward predictive coding
      (Fig. 11.6) is chosen as û(n) ≜ ρu(n - 1). The prediction error sequence ε(n) ≜
      u(n) - û(n) is quantized using B bits. Show that in the steady state,

          E[|δu(n)|²] = σ_q²/(1 - ρ²) = σ_u² f(B)

      where σ_q² = σ_u²(1 - ρ²) f(B) is the mean square quantization error of ε(n). Hence the
      feedforward predictive coder cannot perform better than DPCM because the pre-
      ceding result shows its mean square error is precisely the same as in PCM. This result
      happens to be true for arbitrary stationary sequences utilizing arbitrary linear pre-
      dictors. A possible instance where the feedforward predictive coder may be preferred
      over DPCM is in the distortionless case, where the quantizer is replaced by an
      entropy coder. The two coders will perform identically, but the feedforward predic-
      tive coder will have a somewhat simpler hardware implementation.
11.4  (Delta modulation analysis) For delta modulation of the AR sequence of (11.25),
      write the prediction error as e(n) = ε(n) - (1 - ρ)u(n - 1) + δe(n - 1). Assuming a
      1-bit Lloyd-Max quantizer and δe(n) to be an uncorrelated sequence, show that

          σ_e²(n) = 2(1 - ρ)σ_u² + (2ρ - 1)σ_e²(n - 1) f(1)

      from which (11.26) follows after finding the steady-state value of σ_e²(n).
Figure P11.5 Coder and decoder for Problem 11.5: quantizer, predictor P(z_1, z_2),
reconstruction filter A(z_1, z_2), and channel noise.
11.6  (DPCM analysis) For DPCM of the AR sequence of (11.25), write the prediction
      error as e(n) = ε(n) + ρδe(n - 1). Assuming δe(n) to be an uncorrelated sequence,
      show that the steady-state distortion due to DPCM is

      For ρ = 0.95, plot the normalized distortion D/σ_u² as a function of bit rate B for
      B = 1, 2, 3, 4 for a Laplacian density quantizer and compare it with PCM.
11.7* For a 512 x 512 image of your choice, design DPCM coders using mean square
predictors of orders up to four. Implement the coders for B = 3 and compare the
reconstructed images visually as well as on the basis of their mean square errors and
entropies.
11.8  a. Using the transform coefficient variances given in Table 5.2 and the Shannon
         quantizer based rate distortion formulas (11.61) to (11.63), compare the distor-
         tion versus rate curves for the various transforms. (Hint: An easy way is to arrange
         the σ_k² in decreasing order, let θ = σ_j², j = 0, ..., N - 1, and plot
         D_j ≜ (1/N) Σ_{k=j}^{N-1} σ_k²  versus  R_j ≜ (1/2N) Σ_{k=0}^{j-1} log_2(σ_k²/σ_j²).)
      b. Compare the cosine transform R versus D function when the bit allocation is
         determined first by truncating the real numbers obtained via (11.61) to the nearest
         integers and, second, by using the integer bit allocation algorithm.
11.9  (Whitening transform versus unitary transform) An N x 1 vector u with covariance
      R = {ρ^|i-j|} is transformed as v = Lu, where L is a lower triangular (nonunitary)
      matrix whose elements are

          l_{i,j} = 1,     i = j
                  = -ρ,    i - j = 1
                  = 0,     otherwise,      0 ≤ i, j ≤ N - 1.
      For the initial sample ε(0), the same distortion level is achieved by using
      (1/2) log_2(1/D) bits. From these and (11.69) show that

          R_DPCM - R_KL = ((N - 1)/2N) log_2[1 + ρ²D/(1 - ρ²)],    D ≤ (1 - ρ)/(1 + ρ)

      Calculate this difference for N = 16, ρ = 0.95, and D = 0.01, and conclude that
      at low levels of distortion the performance of KL transform and DPCM coders
      is close for Markov sequences. This is a useful result, which can be generalized
      for AR sequences. For ARMA sequences, bandlimited sequences, and two-
      dimensional random fields, this difference can be more significant.
11.11 For the separable covariance model used in Example 11.5, with ρ = 0.95, plot and
      compare the R versus D performances of (a) various transform coders for 16 x 16
      block size utilizing Shannon quantizers (Hint: Use the data of Table 5.2.) and (b)
      N x N block cosine transform coders with N = 2^n, n = 1, 2, ..., 8. [Hint: Use eq.
      (P5.28-2).]
11.12 Plot and compare the R versus D curves for 16 x 16 block transform coding of images
      modeled by the nonseparable exponential covariance function 0.95^√(m² + n²) using the
      discrete Fourier, cosine, sine, Hadamard, slant, and Haar transforms. (Hint: Use the
      results of Problem P5.29 to calculate the transform domain variances.)
11.13* Implement the zonal transform coding algorithm of Section 11.5 on 16 x 16 blocks of
      an image of your choice. Compare your results for average rates of 0.5, 1.0, and 2.0
      bits per pixel using the cosine transform or any other transform of your choice.
11.14* Develop a chart of adaptive transform coding algorithms containing details of the
algorithms and their relative merits and complexities. Implement your favorite of
these and compare it with the 16 x 16 block cosine transform coding algorithm.
11.15 The motivation for hybrid coding comes from the following example. Suppose an
      N x N image u(m, n) has the autocorrelation function r(k, l) = ρ^(|k| + |l|).
      a. If each column of the image is transformed as v_n = Φu_n, where Φ is the KLT
         of u_n, then show that the autocorrelation of v_n satisfies E[v_n(k)v_{n'}(k')] =
         λ_k ρ^|n - n'| δ(k - k'). What are Φ and λ_k?
      b. This means the transformed image is uncorrelated across the rows. Show that
         the pixels along each row can be modeled by the first-order AR process of (11.84)
         with a(k) = ρ, b(k) = 1, and σ_ε²(k) = (1 - ρ²)λ_k.
11.16 For images having the separable covariance function with ρ = 0.95, find the optimum
      pairs [n, k(n)] for DPCM transmission over a noisy channel with p_e = 0.001, employing
      the optimum mean square predictor. [Hint: β² = (1 - ρ²)².]
11.17 Let the transition probabilities q_0 = p(0|1) and q_1 = p(1|0) be given. Assuming all the
      runs to be independent, their probabilities can be written as

          P_i(l) = q_i(1 - q_i)^(l-1),    l ≥ 1,    i = 0 (white), 1 (black)

      a. Show that the average run lengths and entropies of white and black runs are
         μ_i = 1/q_i and H_i = (-1/q_i)[q_i log_2 q_i + (1 - q_i) log_2(1 - q_i)]. Hence the achievable
         compression ratio is 1/(H_0 P_0/μ_0 + H_1 P_1/μ_1), where P_i = q_i/(q_0 + q_1), i = 0, 1, are
         the a priori probabilities of white and black pixels.
      b. Suppose each run length is coded in blocks of m-bit words, each word represent-
         ing the M - 1 run lengths in the interval [kM, (k + 1)M - 1], M = 2^m, k = 0,
         1, ..., together with a block terminator code. Hence the average number of bits used
         for white and black runs will be m Σ_{k=0}^∞ (k + 1) P[kM ≤ l_i ≤ (k + 1)M - 1], i = 0, 1.
         What is the compression achieved? Show how to select M to maximize it.
•
,
, ,
•
Data compression has been a topic of immense interest in digital image processing.
Several special issues and review papers have been devoted to this. For details and
extended bibliographies:
. 1. Special Issues (a) Proc. IEEE 55, no. 3 (March 1967), (b) IEEE Commun. Tech.
COM-19, no. 6, part I (December 1971), (c) IEEE Trans. Commun. COM-25, no. 11
(November 1977), (d) Proc. IEEE 68, no. 7 (July 1980). (e) IEEE Trans. Commun.
COM-29 (December 1981), (f) Proc. IEEE 73, no. 2 (February 1985).
2. T. S. Huang and O. J. Tretiak (eds.), Picture Bandwidth Compression. New York:
Gordon and Breach, 1972.
3. L. D. Davisson and R. M. Gray (eds.), Data Compression. Benchmark Papers in Elec-
trical Engineering and Computer Science. Stroudsburg, Penn.: Dowden, Hutchinson &
Ross, Inc., 1976.
. ,
4. W. K. Pratt (ed.). Image Transmission Techniques. New York: Academic Press, 1979.
5. A. N. Netravali and J. O. Limb. "Picture Coding: A Review." Proc. IEEE 68, no. 3
(March 1980): 366-406.
•
6. A. K. Jain. "Image Data Compression: A Review." Proc. IEEE 69, no. 3 (March 1981):
349-389.
7. N. S. Jayant and P. Noll. Digital Coding of Waveforms. Englewood Cliffs, N.J.: Prentice-
Hall, 1984.
8. A. K. Jain, P. M. Farrelle, and V. R. Algazi, "Image Data Compression." In Digital
Image Processing Techniques, M. P. Ekstrom, ed. New York: Academic Press, 1984.
9. E. Dubois, B. Prasada, and M. S. Sabri. "Image Sequence Coding." In Image Sequence
Analysis, T. S. Huang (ed.). New York: Springer-Verlag, 1981, pp. 229-288.
Section 11.2
For entropy coding, Huffman coding, run-length coding, arithmetic coding, vector
quantization, and related results of this section see papers in [3] and:
•
10. J. Rissanen and G. Langdon. "Arithmetic Coding." IBM J. Res. Develop. 23 (March
1979): 149-162. Also see IEEE Trans. Commun. COM-29, no. 6 (June 1981): 858-867.
11. J. W. Schwartz and R. C. Barker. "Bit-Plane Encoding: A Technique for Source
Encoding." IEEE Trans. Aerospace Electron. Syst. AES-2, no. 4 (July 1966): 385-392•
12. A. Gersho. "On the Structure of Vector Quantizers." IEEE Trans. Inform. Theory
IT-28 (March 1982): 157-165. Also see vol. IT-25 (July 1979): 373-380.
Section 11.3
For some early work on predictive coding, delta modulation, and DPCM see Oliver,
Harrison, O'Neal, and others in Bell System Technical Journal issues of July 1952,
May-June 1966, and December 1972. For more recent work:
•
Section 11.4
For results related to the optimality of the KL transform, see Chapter 5 and the
bibliography of that chapter. For the optimality of KL transform coding and bit
allocations, see [6], [32], and:
21. A. Segall. "Bit Allocation and Encoding for Vector Sources." IEEE Trans. Inform.
Theory IT-22, no. 2 (March 1976): 162-169. .
,
,, ,
Section 11.5
For early work on transform coding and subsequent developments and examples of
different transforms and algorithms, see Pratt and Andrews (pp. 515-554), Woods
and Huang (pp. 555-573) in [2], and:
22. A. Habibi and P. A. Wintz. "Image Coding by Linear Transformation and Block Quanti-
zation." IEEE Trans. Commun. Tech. COM-19, no. 1 (February 1971): 50-63.
•
The concepts of fast KL transform and recursive block coding were introduced
in [26 and Ref. 17, Ch. 5]. For details and extensions see [6], Meiri et al. (pp.
1728-1735) in [1e], Jain et al. in [8], and:
26. A. K. Jain. "A Fast Karhunen-Loeve Transform for a Class of Random Processes."
IEEE Trans. Commun. COM·24 (September 1976): 1023-1029.
27. A. K. Jain and P. M. Farrelle, "Recursive Block Coding." Sixteenth Annual Asilomar
Conference on Circuits, Systems, and Computers, November 1982. Also see P. M.
Farrelle and A. K. Jain. IEEE Trans. Commun. (February 1986), and P. M. Farrelle,
Ph.D. Dissertation, U.C. Davis, 1988.
For results on two-source coding, adaptive transform coding, and the like,
see Yan and Sakrison (pp. 1315-1322) in [1c], Jain and Wang [32], Tasto and Wintz
(pp. 956-972) in [1b], Graham (pp. 336-346) in [1a], Chen and Smith in [1c], and:
•
Section 11.6
• •
The hybrid coding principle, its analysis and relationship with semicausal models, and
its applications can be found in [6, 8], Jones (Chapter 5) in [4], and:
31. A. Habibi. "Hybrid Coding of Pictorial Data." IEEE Trans. Commun. COM-22
(May 1974): 614-626.
32. A. K. Jain and S. H. Wang. "Stochastic Image Models and Hybrid Coding," Final
Report, NOSC contract N00953-77-C-003MJE, Department of Electrical Engineering,
SUNY Buffalo, New York, October 1977. Also see Technical Report #SIPL·79·6, Signal
and Image Processing Laboratory, ECE Dept., University of California at Davis,
September 1979.
33. R. W. Means, E. H. Wrench and H. J. Whitehouse. "Image Transmission via Spread
Spectrum Techniques." ARPA Quarterly Technical Reports ARPA-QR6, QR8 Naval
Ocean Systems Center, San Diego, Calif., January-December 1975.
34. A. K. Jain. "Advances in Mathematical Models for Image Processing." Proc. IEEE 69
(March 1981): 502-528.
•
Section 11.7
,
For interframe predictive coding, see [5, 6, 8], Haskell et al. (pp. 1339-1348) in [1c],
Haskell (Chapter 6) in [4], and:
Interframe hybrid and transform coding techniques are discussed in [5, 6, 8, 9],
Roese et al. (pp. 1329-1338), Natarajan and Ahmed (pp. 1323-1329) in [1c], and:
38. J. A. Stuller and A. N. Netravali. "Transform Domain Motion Estimation," Bell Syst.
Tech. J. (September 1979): 1623-1702. Also see pages 1703-1718 of the same issue for
application to coding.
39. J. R. Jain and A. K. Jain. "Interframe Adaptive Data Compression Techniques for
Images." Tech. Rept., Signal and Image Processing Laboratory, ECE Department,
University of California at Davis, August 1979.
40. J. O. Limb and J. A. Murphy. "Measuring the Speed of Moving Objects from Television
Signals," IEEE Trans. Commun. COM-23 (April 1975): 474-478.
41. C. Cafforio and F. Rocca. "Methods for Measuring Small Displacements of Television
Images." IEEE Trans. Inform. Theory IT-22 (September 1976): 573-579.
Section 11.8
For the optimal mean square encoding and decoding results and their application,
we follow [6, 39] and:
42. G. A. Wolf. "The Optimum Mean Square Estimate for Decoding Binary Block Codes."
Ph.D. Thesis, University of Wisconsin at Madison, 1973. Also see G. A. Wolf and
R. Redinbo. IEEE Trans. Inform. Theory IT-20 (May 1974): 344-351.
,
For extended bibliography, see [6, 8].
•
Section 11.9
[1d] is devoted to coding of two-tone images. Details of CCITT standards and
various algorithms are available there. Some other useful references are Arps (pp.
222-276) in [4], Huang in [1c, 2], Musmann and Preuss in [1c], and:
"
Section 11.10
Color and multispectral coding techniques discussed here have been discussed by
Limb et al. in [1c], Pratt in [1b], and:
Section 11.11
Several techniques not discussed in this chapter include nonuniform sampling tech-
niques combined with interpolation (such as using splines), use of singular value
decompositions, autoregressive (AR) model synthesis, and the like. Summary dis-
cussion and the relevant sources of these and other useful methods are given in
[6, 8].