Image Processing, Analysis, and Machine Vision
All content following this page was uploaded by Roger David Boyle on 13 May 2015.
Second Edition
1D one dimension(al)
2D two dimension(al)
3D three dimension(al)
AI artificial intelligence
ASM active shape model
B-rep boundary representation
CAD computer-aided design
CCD charge-coupled device
CSG constructive solid geometry
CT computer tomography
dof degrees of freedom
ECG electro-cardiogram
EEG electro-encephalogram
FFT fast Fourier transform
FOE focus of expansion
GA genetic algorithm
HMM hidden Markov model
IHS intensity, hue, saturation
JPEG Joint Photographic Experts Group
MR magnetic resonance
MRI magnetic resonance imaging
OCR optical character recognition
OS order statistics
PDM point distribution model
PET positron emission tomography
PMF Pollard-Mayhew-Frisby (correspondence algorithm)
RGB red, green, blue
SNR signal-to-noise ratio
SVD singular value decomposition
TV television
Preface
Image processing, analysis and machine vision represent an exciting and dynamic
part of cognitive and computer science. Following an explosion of interest during
the 1970s, the 1980s and 1990s were characterized by the maturing of the field
and the significant growth of active applications; remote sensing, technical
diagnostics, autonomous vehicle guidance, medical imaging (2D and 3D) and automatic
surveillance are the most rapidly developing areas. This progress can be seen in
an increasing number of software and hardware products on the market, as well
as in a number of digital image processing and machine vision courses offered at
universities worldwide.
There are many texts available in the areas we cover; most (indeed, all of which
we know) are referenced somewhere in this book. The subject suffers, however, from
a shortage of texts which are 'complete' in the sense that they are accessible to the
novice, of use to the educated, and up to date. Here we present the second edition
of a text first published in 1993, in which we hope to include many of the very rapid
developments that have taken and are still taking place, and which quickly age some
of the very good textbooks produced over the last two decades or so. The target
audience ranges from the undergraduate with negligible experience in the area to
the Master's and research student seeking an advanced springboard in a particular
topic. Every section of this text has been updated since the first version
(particularly with respect to references); additionally, wholly new sections are presented
on: compression via JPEG and MPEG; fractals; fuzzy logic recognition; hidden
Markov models; Kalman filters; point distribution models; three-dimensional vision;
watershed segmentation; wavelets; and an entire chapter devoted to case studies.
Each chapter now also includes a concise Summary section. To help the
reader to acquire practical understanding, newly added Exercise sections
accompany each chapter; these are in the form of short-answer questions and problems
of varying difficulty, frequently requiring practical usage of computer tools and/or
development of application programs.
This book reflects the authors' experience in teaching one- and two-semester
undergraduate and graduate courses in Digital Image Processing, Digital Image
Analysis, Machine Vision, Pattern Recognition, and Intelligent Robotics at their
respective institutions. We hope that this combined experience will give a thorough
grounding to the beginner and provide material that is advanced enough to allow
the more mature student to understand fully the relevant areas of the subject. We
acknowledge that in a very short time the more active areas will have moved beyond
this text.
This book could have been arranged in many ways. It begins with low-level
processing and works its way up to higher levels of image interpretation; the authors
have chosen this framework because they believe that image understanding
originates from a common database of information. The book is formally divided into
16 chapters, beginning with low-level processing and working toward higher-level
image representation, although this structure will be less apparent after Chapter 10,
when we present transforms, compression, morphology, texture, and motion analysis,
which are very useful but often special-purpose approaches that may not always
be included in the processing chain. The final chapter presents four live research
projects which illustrate in practical use much of what has gone before.
Decimal section numbering is used, and equations and figures are numbered
within each chapter. Each chapter is accompanied by an extensive list of references
and exercises. A selection of algorithms is summarized formally in a manner that
should aid implementation. Not all the algorithms discussed are presented in this
way (this might have doubled the length of the book); we have chosen what we
regard as the key, or most useful or illustrative, examples for this treatment.
Each chapter presents material from an introductory level through to an overview
of current work; as such, it is unlikely that the beginner will, at the first reading,
expect to absorb all of a given topic. Often it has been necessary to make reference
to material in later chapters and sections, but when this is done an understanding
of material in hand will not depend on an understanding of that which comes later.
It is expected that the more advanced student will use the book as a reference text
and signpost to current activity in the field. We believe at the time of going to
press that the reference list is full in its indication of current directions, but record
here our apologies to any work we have overlooked. The serious reader will note
that many references are very recent, and should be aware that before long more
relevant work will have been published that is not listed here.
This is a long book and therefore contains material sufficient for much more
than one course. Clearly, there are many ways of using it, but for guidance we
suggest an ordering that would generate four distinct modules:
Digital Image Processing, an undergraduate course.
Digital Image Analysis, an undergraduate/graduate course, for which Digital
Image Processing may be regarded as prerequisite.
Computer Vision I, an undergraduate/graduate course, for which Digital Image
Processing may be regarded as prerequisite.
Computer Vision II, a graduate course, for which Computer Vision I may be
regarded as prerequisite.
The important parts of a course, and necessary prerequisites, will naturally be
specified locally; a suggestion for partitioning the contents follows this Preface.
Assignments should wherever possible make use of existing software; it is our
experience that courses of this nature should not be seen as 'programming courses',
but it is the case that the more direct practical experience the students have of
the material discussed, the better is their understanding. Since the first edition was
published, an explosion of World Wide Web-based material has been made available,
permitting many of the exercises we present to be conducted without the necessity
of implementing from scratch. We do not present explicit pointers to Web material,
since they evolve so quickly; however, pointers to specific support materials for this
book and others may be located via the publisher, http://www.brookscole.com.
The book has been prepared using the LaTeX text processing system. Its
completion would have been impossible without extensive usage of the Internet computer
network and electronic mail. We should like to acknowledge the University of Iowa,
the Czech Technical University, and the School of Computer Studies at Leeds
University for providing the environment in which this book was prepared.
Milan Sonka was a faculty member of the Department of Control Engineering,
Faculty of Electrical Engineering, Czech Technical University, Prague, Czech
Republic for ten years, and is now an Associate Professor at the Department of
Electrical and Computer Engineering, the University of Iowa, Iowa City, Iowa, USA. His
research interests include medical image analysis, knowledge-based image analysis,
and machine vision. Vaclav Hlavac is an Associate Professor at the Department
of Control Engineering, Czech Technical University, Prague. His research interests
are knowledge-based image analysis and 3D model-based vision. Roger Boyle is a
Senior Lecturer in Artificial Intelligence in the School of Computer Studies at the
University of Leeds, England, where his research interests are in low-level vision and
pattern recognition. The first two authors have worked together for some years, and
have been co-operating with the third since 1991.
The authors have spent many hours in discussions with their teachers,
colleagues, and students, from which many improvements to early drafts of this text
resulted. Particular thanks are due to Tomas Pajdla, Petr Kodl, Radim Sara at the
Czech Technical University; Steve Collins at the University of Iowa; Jussi
Parkkinen at the University of Lappeenranta; Guido Prause at the University of Bremen;
David Hogg at the University of Leeds; and many others whose omission from this
list does not diminish the value of their contribution. The continuous support and
encouragement we received from our wives and families, while inexplicable, was
essential to us throughout this project; once again, we promise that our next book
will not be written outside standard office hours or during holidays (but this time
we mean it).
All authors have contributed throughout; the ordering on the cover corresponds
to the weight of individual contribution. Any errors of fact are the joint
responsibility of all, while any errors of typography are the responsibility of Roger Boyle.
Jointly, they will be glad to incorporate any corrections into future editions.
Milan Sonka
The University of Iowa, Iowa City, Iowa, USA
Vaclav Hlavac
Czech Technical University, Prague, Czech Republic
Roger Boyle
University of Leeds, Leeds, England
Course contents
In this section, one possible ordering of the material covered in the four courses
proposed in the Preface is given. This coverage should not be considered the only
possibility; on the contrary, the possibilities for organizing Image Processing and
Analysis courses are practically endless. Therefore, what follows should only be
regarded as suggestions, and instructors should tailor course content to fit the already
acquired knowledge, abilities, and needs of the students enrolled.
Digital Image Processing. An undergraduate course.
1 Introduction
2 The digitized image and its properties
3 Data structures for image analysis
4 Image pre-processing (excluding 4.3.6–4.3.9, 4.4.3, limited coverage of 4.3.4,
4.3.5)
5 Segmentation
5.1 Thresholding (excluding 5.1.3, 5.1.4)
5.2 Edge-based segmentation (excluding 5.2.8, limited coverage of
5.2.4, 5.2.5)
5.3 Region growing segmentation (excluding 5.3.4)
5.4 Matching
12 Linear discrete image transforms
13 Image data compression
16 Case studies (selected topics)
Digital Image Analysis. An undergraduate/graduate course, for which Digital
Image Processing may be regarded as prerequisite. Sections that were
covered in the Digital Image Processing class and re-appear are intended to be
discussed in more depth than is possible in the introductory course.
1 Introduction (brief review)
2 The digitized image and its properties (brief review)
5 Segmentation
5.1.3 Multi-spectral thresholding
5.1.4 Thresholding in hierarchical data structures
5.2.4 Edge following as graph searching
5.2.5 Edge following as dynamic programming
5.3.4 Watershed segmentation
6 Shape representation and description (excluding 6.2.7, 6.3.4–6.3.6, 6.4)
7 Object recognition
7.1 Knowledge representation
7.2 Statistical pattern recognition
7.3 Neural networks
7.4 Syntactic pattern recognition
11 Mathematical morphology
14 Texture
16 Case studies (selected topics)
Computer Vision I. An undergraduate/graduate course, for which Digital Image
Processing may be regarded as prerequisite.
1 Introduction (brief review)
2 The digitized image and its properties (brief review)
4 Image pre-processing
4.3.3 Zero-crossings of the second derivative
4.3.4 Scale in image processing
4.3.5 Canny edge detection
4.3.6 Parametric edge models
4.3.7 Edges in multi-spectral images
4.3.8 Other local pre-processing operators
4.3.9 Adaptive neighborhood pre-processing
6 Shape representation and description
7 Object recognition
8 Image understanding
16 Case studies (selected topics)
Computer Vision II. A graduate course, for which Computer Vision I may be
regarded as prerequisite.
5 Segmentation
5.2.4 Edge following as graph searching
5.2.5 Edge following as dynamic programming
5.5 Advanced border and surface detection approaches
9 3D Vision, geometry and radiometry
10 Use of 3D vision
15 Motion analysis
Practical 3D vision projects
Index
Bold text refers to major or defining entries.

2.5D sketch, 445–446, 520
2D co-ordinate system, 510, 538
2D projection, 231, 232, 521, 522, 541, 544, 680
2D shape, 43, 228, 231
3D co-ordinate system, 537
3D information, 228, 509
3D interpretation, 444, 510, 522
3D model, 520
3D object, 519
3D representation, 11, 445, 446, 520, 521
3D shape, 43, 228, 232

A-algorithm, 148–156, 160, 161
ACRONYM, 373, 527
active perception, 447
active sensor, 484
active shape model, see ASM
active vision, 512
acuity, 34
adaptive neighborhood, 98–102
ADEOS, 621
albedo, 492
algorithm (in Marr's theory), 444
aliasing, 20
    anti-aliasing, 22
anti-extensive transformation, 566
arc (of a graph), 47
area, 45, 51, 237, 254–256, 260, 292, 527, 560
area-based stereo, 480
ASM, 387–390, 722
aspect, 545
aspect graph, 545
autocorrelation, 627

B-reps, 527
back propagation, see neural nets
back-projection, 538, 539
back-tracking, 160, 161, 320, 321, 326, 327, 373, 404, 538, 539
background, 29
ball, 577
    geodesic, 585
    maximal, 577
    unit, 577
balloon, 378
baseline, 458
bay, 32
Bayes formula, 300, 301, 391, 394, 395
BDRF, 491
bin-picking, 481, 541
blackboard, 372, 373
blocks world, 521
blur, 745
    Gaussian, 33, 84, 90, 97
Boltzmann constant, 333
border, 30, 34
    detection, 134, 335
        optimal, 148–163, 194
        simultaneous, 194, 731
    extended, 144–146
    inner, 30, 142
    inter-pixel, 144
    outer, 30, 142–145
border detection, 722
boundary, see border
    occlusion, 496
boundary representations, see B-reps
brightness, 2, 3, 5, 10, 11, 12, 18, 22, 23, 27, 29, 32–34, 37, 42–44, 46, 47, 52, 57–59, 61, 63, 72–74, 77, 94, 98–102, 124, 126–128, 133, 134, 146, 147, 174, 176–178, 180, 181, 185, 259, 261, 394, 488, 495, 543, 621–
        623, 630, 632, 649, 653, 655, 656, 659, 667, 682, 683
    correction, 58
    interpolation, 65–68
    transformation, 58–61
brightness interpolation
    bi-cubic, 67
    linear, 67
    nearest neighbor, 66

calculus, 27, 77, 329, 559
camera, 10, 36, 43, 58, 103, 105, 250, 253, 679, 680, 686
    extrinsic parameters, 452
    intrinsic parameters, 452
    self-calibration, 454
Canny edge detector, 80, 90–93
center of gravity, 260, 272
chain, 45
    Markov, 659
chain code, 45, 146, 236–238, 244, 255
chamfer matching, see matching, chamfer
chamfering, 27, 192, 193
characteristic strip, 495
characteristic view, 544
chromaticity, 26
class, 297
    identifier, 298
classification
    contextual, 392–397, 403
    recursive, 395
classifier, 297–306, 319
    Φ, 300, 302
    best approximation, 302
    learning, 302, 303–306, 321
    linear, 299
    maximum likelihood, 300
    minimum distance, 299, 306–308, 724
    minimum error, 300–306
    non-linear, 300
    setting, 300–303
    syntactic, 319
clique, 327, 328
closing, 568–569
cluster analysis, 307, 308, 722
clustering, see cluster analysis
CMY, 25
co-lineation, 449
co-occurrence matrix, 44
co-ordinate system
    2D, see 2D co-ordinate system
    3D, see 3D co-ordinate system
    object-based, 11, 446, 520
    polar, 235
    rectangular, 235
    tangential, 235, 237
COBIUS, 372
code
    chain, see chain code
    dictionary, 629, 633
    Freeman, 45
    leaf, 52
    run length, 46
coding
    Huffman, 633
    low and high frequencies, 632
    region border, 632
color, 34
    image, 23
    palette, 59, 633
    primary, 23
    secondary, 24
colorimetry, 12
compatibility function, 398, 410
compression, 3, 5, 621–637
    application
        asymmetric, 629, 636
        symmetric, 629, 636
    dictionary-based, 633
    DPCM, 627, 629
    fractal, 632
    hierarchical, 630–632
    hybrid, 621, 629, 630, 632
    JPEG, 606, 612, 633–635
    Lempel-Ziv, 633
    MJPEG, 635, 637
    MPEG, 612, 634, 636–637
    predictive, 621, 624–629, 632
    progressive, 630–631
    pyramid, 630
    ratio, 623, 624, 627, 630
    region growing, 632
    smart, 630, 631
    transform, 621, 623, 631, 632
    vector quantization, 629, 632
    wavelet, 624
computed tomography, 738
computer graphics, 11, 43, 62, 245, 663
confidence, 395–413
conic, 251, 252
constraint
    epipolar, see epipolar constraint
    propagation, 397–401, 405, 406, 521
constructive solid geometry, 525
context, 123, 137–139, 364, 367, 392–397, 405, 406
contour
    false, 23
    partitioning, 244
    shape from, see shape from contour
contrast, 33, 34, 59, 60, 97, 99, 130, 135, 189, 543, 652, 654, 683
    enhancement, 100–102
control strategy, 291, 363–373, 399
    bottom-up, 157, 290, 365–366, 367, 368, 371, 400
    combined, 367, 371
    hierarchical, 364, 371
    hypothesize and verify, see hypothesize and verify
    model-based, 366–368
    non-hierarchical, 371, 372
    parallel, 364, 366, 371
    serial, 364
    top-down, 365–368, 371
convex hull, 31, 559
convolution, 13, 192, 600, 602, 604, 608, 609
    mask, 69
    theorem, 14
core, 269
corner, 97
correlation, 16, 191–194, 248, 621–624
correlation-based correspondence, 480
correspondence, 63, 97, 509, 510, 680, 681, 696, 697, 699, 704
    problem, 97, 482, 509, 510
    stereo, see stereo correspondence
correspondence problem, 476
cost function, 148–156, 722, 731, 735
    automated design, 163
cost transform, 154, 163
crack edge, 31, 138, 139, 141, 144, 148, 178, 179
criterion
    detection, 90
    localization, 90
    one response, 90
crossover, 331–333
CSG, see constructive solid geometry
CT imaging, 722
curvature, 237
    peak, 244
    primal sketch, 245
curvature primal sketch, 542
curve
    decomposition, 245
    detection, 167, 169, 171, 173
    granulometric, 589
cyclopean image, 481
cyclopean separation, 482

daemon, 373, 399
data structure, 42
    hierarchical, 49
    relational, 48
    traditional, 43
de-centering, 456
de-fuzzification, 339, 342
    composite maximum, 342
    composite moments, 342
decimation, 530
decision rule, 298, 300, 304
deficit of convexity, 32
deformable objects, 520
degradation, 105
    atmospheric turbulence, 105
    relative motion of the camera and object, 105
    wrong lens focus, 105
depth, 11, 515
depth map, 445, 484, 514
diagram
    Voronoi, 404
difference image, see image, difference
difference of Gaussians, 86
dilation, 563–565, 745
    conditional, 592
    geodesic, 586
    gray-scale, 569–574
Dirac distribution, 13
discrete topology, 30
discrimination function, 298–302
disparity, 459, 482
    gradient, 481, 482
    gradient limit, 481
distance, 27, 192
    chessboard (D8), 27
    city block (D4), 27
    Euclidean (DE), 27
    geodesic, 585
    Levenshtein, 328
distance function, 584
DoG, see difference of Gaussians
duality (morphological), 561
dynamic programming, 158–161, 730
    live lane, 163
    live wire, 162

Ebbinghaus illusion, 34
edge, 3, 4, 30, 445
    chain, 134, 158
    crack, 31
    detector, 335
    direction, 77
    magnitude, 77
    relaxation, 137–142, 156, 161
edge detection, 722
edge detector, 77–88, 445, 537, 730, 734
    Canny, 90–93, 655
    compass, 81
    facet model, 93
    in multi-spectral image, 94
    Kirsch, 83
    Laplace, 78, 81
    Marr-Hildreth, 83
    parametric, 80, 93
    Prewitt, 81
    Roberts, 80
    Robinson, 83
    Sobel, 82
    zero-crossing, 83
EGI, 543
ego-motion, 467
elastics, 367
entropy, 15, 621–623, 654
epipolar constraint, 459, 477, 483
epipolar line, 458, 483
epipolar plane, 458
epipolar transfer, 476
epipole, 458
erosion, 565–567, 745
    geodesic, 586
    gray-scale, 569–574
    ultimate, 582
errors
    matching, 535
essential matrix, 462
Euler-Poincaré characteristic, 256, 560
evaluated graph, 47
event (aspect), 545
exemplar, 299, 306–308, 313, 314
expansion
    isotropic, 563
extended boundary, see border, extended
extended Gaussian image, 543
extrinsic parameters, 452

facet, 93
feature, 292
    discriminativity, 303
    informativity, 303
    space, 297
    vector, 292
feature synthesis, 91
feature-based correspondence, 481
feedback, 3, 135, 631
fill, 563
filter, 57–107, 600–613
    Gable, 691
    Gabor, 691
    Gaussian, 445
    median, 74–76
filtering, 68
    band-pass, 609–611
    high-pass, 609, 611
    inverse, 106
    Kalman, 105
    low-pass, 609, 611
    Wiener, 106
fitness, 409
focal point, 449
focus
    shape from, see shape from focus
forward algorithm, 420
Fourier descriptor, see shape description, Fourier
Fourier transform, see transform, Fourier
fractal, 248, 661
    dimension, 237, 657
frame, 295, 296
free-form surface, 519
Freeman code, 45
frequency
    spatial, 14
function, 10
    autocorrelation, 16
    autocovariance, 16
    cross correlation, 16
    cross covariance, 16
    Dirac, 13
    distance, 584
    distribution, 15
    point spread, 17
    quench (morphology), 581
fundamental matrix, 460
fuzzy
    complement, 339
    composition, 339
        min–max, 340
    correlation
        minimum, 340
        product, 341
    intersection, 339
    logic, 294, 336–344, 743–744
    membership function, 336
        maximum normal form, 337
        minimum normal form, 337
    reasoning
        monotonic, 340
    set, 336
        hedge, 337
    space, 336
    system, 336–344
        model, 339
    union, 339

ganglion cell, 88
Gaussian blur, see Gaussian, blur
Gaussian filter, 84, 86, 445
generalized cones, see generalized cylinders
generalized cylinders, 526
genetic algorithm, 330–333, 344, 409–416
genus, 256
geodesic transformation, 585
Geographical Information Systems, 52
geometric signals, 483
geometric transformation, 2, 62–68, 722
geon, 545
Gestaltist theory, 509
GIF, 633
GIS, 52
Golay alphabet, 579
gradient descent, 310
gradient operator, 68, 77–88
    approximated by differences, 79
    Kirsch, 83
    Laplace, 81
    Prewitt, 81
    Roberts, 80
    Robinson, 83
    Sobel, 82
gradient space, 490
grammar, 292, 316–322, 660–664
    context-free, 318
    context-sensitive, 318
    fuzzy, 319
    general, 318
    inference, 316, 321–323
    non-deterministic, 318
    regular, 318
    stochastic, 319
granulometry (morphological), 589
graph, 47, 144, 148, 158, 194, 254, 267, 292, 293, 316, 320, 323–328
    arc, 47
    assignment, 327, 328
    evaluated, 47, 144, 295, 323, 324
    isomorphism, 323–328
    neighborhood, 272
    node, 47
    region, 267, 270
    region adjacency, 47, 53, 124, 180, 182, 272, 401, 405, 406, 409–412, 742
    search, 148–156, 161, 367, 368, 730, 734
        advanced approaches, 194
        heuristic, 151–157, 161
        three-dimensional, 194, 731
    similarity, 323, 328
graph matching, 536
graph search, 722
gray-level, see brightness
gray-scale transformation, 59
grid, 22
    hexagonal, 22
    square, 22
group, 250, 252
    Lie, 250
    plane-projective, 250
grow, 563

HEARSAY, 373
heuristic, 5, 151, 152, 156, 161, 177–179, 190, 405, 406, 416
hidden Markov model, 417–423
    Baum-Welch algorithm, 422, 423
    decoding, 418, 420–422
    evaluation, 418–420
    forward algorithm, 420
    Forward-Backward algorithm, 422
    learning, 418, 422
    Viterbi algorithm, 420, 423
histogram, 32, 123, 127–129, 131, 178, 622, 623
    bi-modal, 127–129
    cumulative, 60
    equalization, 25, 60–61
    modification, 100
    multi-dimensional, 132
    multi-modal, 128
    smoothed, 131
    transformation, 128
hit-or-miss transformation, 568
HMM, see hidden Markov model
hole, 29
homogeneity, 176, 177, 181–190
homogeneous co-ordinates, 448
homotopic substitute (of skeleton), 579
homotopic transformation, 576
Hopfield networks, see neural nets
horizon, 516
Hough transform, see transform, Hough
HSI, 25, 34
hue, 25
human visual system, 445
hypothesis, 362, 366, 409–416
hypothesize and verify, 244, 273, 366, 367, 409, 536
hypothesize-and-verify, 538
hysteresis, 91, 92, 136

ICP algorithm, 533
IHS, see HSI
illumination, 488
image, 10
    binary, 23, 44
    co-ordinates, 12
    color, 23
    compression, see compression
    cyclopean, 481
    difference, 682–684, 745
    digitization, 18–26
    dynamic, 12
    enhancement, 57
    iconic, 42
    intensity, see intensity image
    interpretation, 363–417, 722
    multi-spectral, 23
    pre-processing, 57–107
    quality, 35
    reconstruction, 621–624, 631
    restoration, see restoration
    scale-space, 89, 245
    segmented, 43
    sharpening, 79
    skew, 62, 64, 65, 722, 723
    smoothing, see smoothing
    static, 12
    transform, 600–613
    understanding, 5, 362–417
image irradiance equation, 493
image plane, 449
image rectification, 466
image sharpening, 3
imaging
    ultrasound, 734
implementation (in Marr's theory), 444
impossible objects, 522
impulse
    Dirac, 13
    limited, 20
increasing transformation, 564, 566
inference, 363
intensity, see brightness, 25
intensity axis of symmetry, 269
intensity image, 11, 57, 83, 366, 496, 537, 542, 543, 696
interest point, 97
interpretation
    3D, see 3D interpretation
    genetic, 408
    tree, 404, 536
interval tree, 89, 245
intrinsic parameters, 452
invariants, 231, 249–252, 542
    scalar, 250
inverse filtering, 106
inverse transformation, 568
irradiance, 488
irradiance equation, 493
ISODATA, 308, 391
isotropic expansion, 563
IYQ, see YIQ

Kalman filter, 105, 708–710, 722, 747–749
    Kalman gain matrix, 709
knowledge, 3, 5, 291–296, 330, 333
    a priori, 6, 135, 148, 164, 173, 175, 182, 230, 368, 373, 391, 406, 409, 413, 735, 740
    base, 291, 293, 294
    procedural, 294
    representation, 291–296, 363
Kohonen feature maps, see neural nets
Kohonen networks, see neural nets

label, 232–235, 373, 391–404, 406, 407, 410–412, 740–742
    collision, 233–235
labeling, 232, 233, 255, 373, 391, 393, 395, 396, 397–417, 722
    consistent, 397, 399
    discrete, 398, 404
    probabilistic, 397, 400
    semantic, 397
lacunarity, 657
Lagrange multipliers, 497, 687
lake, 32
Lambertian surface, 492
landmarks, 274
Landsat, 621
language, 316–322
Laplacian, 78, 445
Laplacian of Gaussian, 84
learning, 299, 303–307, 311, 317, 322, 333
    from experience, 363
    unsupervised, 307
LIDAR, 484
light, 58, 88
    source, 11, 495
line
    detector, 537, 539
    finding, 94
    labeling, 400, 521–523, 536
    thinning, 96
linear system, 17
linguistic
    variable, 294, 337, 340
live lane, 163
live wire, 162
local pre-processing, 68–102
local shading analysis, 497
locus
    visibility, 538–540
logic
    fuzzy
        training, 743
luminance, 25
luminous efficacy, 487
luminous flux, 487
LZW, see Lempel-Ziv-Welch

magnetic resonance, 738
map
    region, 47
marker, 591
Markov chain, 659
Markov model, 417, 423
Marr (David), 5, 444
Marr paradigm, see Marr's theory
Marr's theory, 83, 366, 444–446, 520
Marr-Hildreth edge detector, 80, 83, 90, 739
matching, 190–194, 328, 330, 363
    chamfer, 27, 192
    errors, 535
    graphs, 191, 323
    relational structures, 321
    sub-graphs, 328
mathematical morphology, 268
matrix, 43
    camera calibration, 451
    co-occurrence, 44
    essential, 462
    fundamental, 460
    projective, 453
maximal ball, 577
MDL, 531
medial axis, 271
medial axis transform, 268
median filter, see filter, median
memory
    long-term, 373
    short-term, 373
Mexican hat, 85
Minkowski algebra, 563
model, 123, 155, 156, 172, 176, 362, 363, 365–373, 408
    3D, 520
    active contour, 174, 367, 374–380, 681
    base, 536
    deformable, 367, 374–380
    facet, 93, 95, 97
    hidden Markov, see hidden Markov model
    Markov, see Markov model
    partial, 535
    quadric surface, 529
    surface, 520
    volumetric, 520, 523
model-based vision, 535
modes of variation, 383
Moiré interferometry, 486
moment
    invariant, 260
        affine, 261
Moravec detector, 97
morphological noise reduction, 722
morphological transformation, 561
    quantitative, 562
morphology, 559–595, 659
motion, 508, 679–708
    analysis, 679–708
        correspondence of interest points, 680, 681, 696–704
        differential, 681–684
    assumptions, 681, 700, 704
    continuous, 512
    correspondence of interest points, 705
    cyclic, 705
    description length, 702
    events, 705
    features, 705
    field, 680
    gesture interpretation, 705
    lipreading, 705
    object tracking, 700–708
    path coherence, 700
        deviation, 700
        function, 700, 701
    recognition, 705
    relative, 705
    rotational, 510, 513, 686
    shape from, see shape from motion
    trajectory
        parametrization, 705
    translational, 510, 513, 693, 694
    verb recognition, 705
multi-view representation, 544
mutation, 331–333

Necker cube, 444
neighbor, 28
neural nets, 308–315, 344
    adaptive resonance theory, 313
    back-propagation, 310–311
    epoch, 311
    feed-forward nets, 310–311
    gradient descent, 310
    Hopfield, 313–315
    Kohonen networks, 312–313
    momentum, 311
    perceptron, 309
    transfer function, 309
    unsupervised learning, 312–313
node (of a graph), 47
noise, 3, 4, 35, 229, 236, 239, 255, 260, 262, 269, 392
    additive, 36
    Gaussian, 35, 708
    impulsive, 37
    multiplicative, 37
    quantization, 37
    salt-and-pepper, 37
    suppression, 99
    white, 35, 708
non-maximal suppression, 91, 96, 135
NURBS, 527

object
    coloring, 232
    connected component labeling, 148, 232
    description, 297
        formal, 297
        qualitative, 297, 315
        quantitative, 297, 315
        relational, 315
    identification, 232–235
    impossible, 522
    labeling, 232, 233, 255
    recognition, 290–335
    reconstruction, 228, 229
objective function, 402, 403, 407–410
occlusion, 229, 232, 242, 244, 253, 273, 522, 681, 749, 750
occupancy grid, 523
OCR, 4, 62, 228, 418, 420, 423, 722, 724
octrees, 52
opening, 568–569
operator
    morphological, 562
    Zadeh, 339
optical axis, 449
optical center, 449
optical character recognition, see OCR
optical flow, 497, 512, 513, 680, 685–696, 699
    computation, 680, 681
    field, 512
    global and local estimation, 689
optimization, 158, 159, 193, 303, 306, 313, 328–335, 386
orthographic
    projection, 11, 510, 512
    view, 510

palette, see look-up table, 24, 59, 633
parallel implementation, 5, 44, 49, 141, 142, 146, 162, 173, 364, 397, 399, 403, 416, 417, 483, 564, 688, 690
path, 29
    simple, 146
pattern, 297
    space, 297
    vector, 297
pattern recognition, 290–323
    statistical, 292–308, 315
    syntactic, 315–323
PDM, 380–390, 722, 746
    alignment, 381
    covariance matrix, 383, 384
    eigen-decomposition, 383
    landmark, 381, 382–385, 387–389
    modes of variation, 383
    polar, 390
    polynomial regression, 390
perception, 33–35, 363, 660
    color, 34
    human, 22, 33, 100, 515
    visual, 33
perceptron, see neural nets
perimeter, 143, 178, 179, 232, 237, 560
perspective
    projection, 11, 512, 542
perspective projection, 448
photometric stereo, 498
photometry, 12, 487
picture element, 22
pigment, 25
pixel, 22
    adjacency, 28
pixel co-ordinate transformation, 63
planning, 363
plausibility (of a match), 539, 541
point
    representative, 560
    sampling, 22
    sets (morphological), 560
point distribution model, see PDM
post-processing, 392, 393, 408, 416
power spectrum, 17
pre-processing, 57–107, 365, 393
    adaptive neighborhood, 98–102
    classification, 57
    edge detector, see edge detector
    local, 68–102
predicate logic, 293, 294
primal sketch, 445–446, 520
    curvature, 245
primitive
    texture, 515
    volumetric, 446
principal components analysis, 383
probability
    density, 304
    estimation, 304
production
    rules, 293
    system, 293, 294, 363, 373
projection, 256, 510, 512, 538, 560, 608
    2D, see 2D projection
    histogram, 256, 722, 724
    orthographic, 11, 510, 512
    parallel, 11
    perspective, 11, 512
projective matrix, 453
projective transformation, 449
PROLOG, 293
pseudo-color, 61
purposive vision, 448
pyramid, 49, 133, 134, 182–184, 364, 630, 632
    equivalent window, 53
    irregular, 53
    Laplacian, 53
    M-pyramid, 49
    matrix, 49
    reduction factor, 52
    reduction window, 52
    regular, 52
    T-pyramid, 49
    tree, 49

quadric surface model, 529
quadrilinear constraint, 473
quadtree, 51, 182, 189, 235, 237, 255, 272, 630
qualitative vision, 447
quantization, 22
quench function, 581

RADAR, 484
radial distortion, 456
radiance, 488
radiant flux, 487
radiometry, 486
random dot stereograms, 480
range image, 484, 529
receptive eld, 88 unique, 520
reconstruction (morphological), 584, 586 reproduction, 331, 332
rectication, 466 resolution
reduce, 565 radiometric, 12
redundancy spatial, 5, 12
information, 621{624 spectral, 12
reference view, 546 time, 12
re
ectance, 11, 12, 490 restoration, 102{107
re
ectance coecient, 492 deterministic, 103
re
ectance function, 492, 495 geometric mean ltration, 107
re
ectance map, 492 inverse ltering, 106
region, 28, 30 power spectrum equalization, 107
concavity tree, 266 stochastic, 103
decomposition, 254, 271–272 Wiener filtering, 106
identification, 232–235 RGB, 23
skeleton, 248, 254, 267–270, 272 rigidity, 510
region adjacency graph, 47, 53 rigidity constraint, 510
region growing, see segmentation, region growing rim, 518
rotating mask, 73
region map, 47 rotational movement, 510, 513, 686
regional extreme, 582 run length coding, 46, 234, 272, 630
registration, 529
relation SAI, 543
neighborhood, 44 sampling, 18–22
spatial, 43 interval, 19
relational structure, 294, 315, 316, 321, point, 22
322, 363 saturation, 25
relaxation, 128, 138, 139, 141, 189, scale, 88–93, 229, 445, 646–648, 654,
398–404, 407, 417, 482, 497, 660
687, 689, 690 scale-space, 88, 89, 229, 244, 245, 269,
discrete, 398–400 274, 480
probabilistic, 400, 403 scene reconstruction, 471
reliability (of a match), 539 script, 295
remote sensing, 62, 65, 131, 132, 393, seed pixel, 98
397, 406, 621, 623, 705 segmentation, 4, 123–194, 365
representation, 444 border detection, 148–156, 173, 727–737
3D, 446
complete, 520 simultaneous, 731
geometric, 43 border tracing, 142–147, 161
iconic image, 42 extended, 144–146
intermediate, 11, 42 inner, 142
level of, 42 outer, 142–145
multi-view, 544 classification-based, 390–393
relational model, 43 complete, 4, 123, 124, 174
segmented image, 43 dynamic programming, 158–161
skeleton, 527 edge thresholding, 135
edge-based, 123, 134–175, 176, 188 sequential thinning, see thinning, sequential
global, 123, 124 set difference, 560
Hough transform, 162, 163–173 shading, 494
generalized, 165, 171, 172 shape from, see shape from shading
match-based, 190–194
morphological, 590 Shannon sampling theorem, 20
multi-thresholding, 128 shape, 228–273, 519
partial, 4, 123, 134, 135, 174 2D, see 2D shape
region construction 3D, see 3D shape
from borders, 174 class, 229, 273
from partial borders, 175 description
region growing, 123, 176–186, 188, area, 237, 241, 254–256, 257,
189, 404, 406, 408, 416, 722, 258, 260, 271
739 bending energy, 237
color image, 176 border length, 237–238
merging, 144, 177–181 chord distribution, 239
over-growing, 188 compactness, 144, 254, 259, 292,
semantic, 405, 406–408 412
split-and-merge, 181, 182 contour-based, 229, 232, 235–253
splitting, 177, 178, 181
under-growing, 188 convex hull, 262, 266
region-based, 123, 146, 176–186 cross ratio, 250, 251
region-growing, 99 curvature, 232, 237, 242–248,
semantic, 404 271
direction, 254, 258
region growing, 406–408 eccentricity, 256
semi-thresholding, 126 elongatedness, 229, 254, 257, 258,
superslice, 174 269
texture, 335 Euler's number, 256
thresholding, 124–134, 174, 175, external, 229, 232
180, 181, 739 Fourier descriptors, 238, 240–242
hierarchical, 134
minimum error, 129 graph, 254, 267
multi-spectral, 131 height, 256
p-tile, 127, 135 internal, 229, 232
tree, 182–184 invariants, 249–252
watersheds, 186, 408 moments, 241, 248, 258, 259–262, 272
self-calibration, 454
self-occlusion, 476 moments, area-based, 262
semantic net, 294–295, 363, 397 moments, contour-based, 261, 262
semantics, 291, 294, 400, 405, 408, 416, perimeter, 143, 178, 179, 232,
722 237
sequential matching, 538 polygonal, 242–244, 245, 271
sequential thickening, see thickening, projection-invariant, 231, 249
sequential projections, 256
rectangularity, 254, 258 slant, 515
region concavity tree, 266 smart snake, 387
region-based, 229, 232, 248, 254–273 smoothing, 69–77
averaging, 69
run length, 234, 272 averaging according to inverse gradient, 72
segment sequence, 239, 242
signature, 239 averaging with limited data validity, 71
sphericity, 258
spline, 245–248, 722, 745, 746 edge preserving, 69
statistical, 229, 254 Gaussian, 244
syntactic, 228, 229, 243, 272
width, 256 median, see filter, median
shape from non-linear mean filter, 77
contour, 518 order statistics, 76
de-focus, 517 rank filtering, 76
focus, 517 rotating mask, 73
motion, 508–515 snake, 174, 367, 374–380, 681
optical flow, 512 growing, 377
shading, 494–497, 516 SNR, 36
stereo, 476–483 spatial angle, 488
texture, 515–517 spectral density, 17
vergence, 517 spectrum, 17
X, 445 band-limited, 20
shape from X, 508 frequency, 589, 603, 611
shape primitive, 271 granulometric, 589
sharpening, 79 phase, 603
shrink, 565 power, 603
sieving analysis, 589 spline, see shape, description, spline
sifting, 13 state space search, 178
signal-to-noise ratio, see SNR stereo
silhouette, 518 shape from, see shape from stereo
simplex angle image, 543 stereo correspondence, 335, 509
simulated annealing, 333–335 stereo vision, 457, 458, 509
singular point, 496 stereopsis, 457
singular value decomposition (SVD), stochastic process, 15–17
456 ergodic, 17
skeleton, 248, 254, 267–270, 272, 527, stationary, 16
559, 576–579 uncorrelated, 16
by influence zones, 585 structure from motion theorem, 510
by maximal balls, 577 structured light, 484
sketch structuring element (morphological),
2.5D, see 2.5D sketch 560
primal, see primal sketch super-quadrics, 525
skew, 62, 64, 65, 722, 723 supergrid, 144, 178, 179
skewing, see skew surface
SKIZ, skeleton by influence zones, 585 detection, 194
free-form, 519 texture transform, 659
surface features, 542 element, 646
surface models, see B-reps fine, 646, 648–651, 655, 659
surface reflectance, 490 generation, 661, 662
surveillance, 744 gradient, 515, 516
SVD, singular value decomposition, 456 hierarchical, 664
sweep representations, see generalized primitive, 175, 515, 646, 648, 649,
cylinders, 526 660, 661, 663, 665, 667
symbol segmentation, 654, 659, 660, 664–666
non-terminal, 317
terminal, 317 shape from, see shape from texture
symmetric axis transform, 268
syntactic analysis, 316–322, 660, 662, strong, 648, 666
663, 722 structure, 646, 648
syntax, 291 tone, 646, 648, 667
system approach, 88 weak, 648, 666
system theory, 443 theory
computational, 444
texel, 515, 646 Gestaltist, see Gestaltist theory
texton, 649, 654 Marr's, see Marr's theory
texture, 123, 127, 134, 177, 335, 390, thickening, 559, 578–579
393, 397, 515, 646–668, 705 sequential, 579
Brodatz, 651 thinning, 254, 267, 268, 559, 578–579
coarse, 646, 648–651, 655, 659 sequential, 579
description three dimensions, see 3D
autocorrelation, 649, 660 threshold, 124
autoregression model, 659 optimal, 129
chain grammar, 661 selection, 124, 127, 129
co-occurrence, 651, 653, 660 mode method, 128
discrete transform, 650 optimal, 128
edge frequency, 653 thresholding, 59
fractal, 657, 661 adaptive, 125
grammar, 660 optimal, 128
graph grammar, 663, 664, 666 with hysteresis, 91
hybrid, 649, 660, 666–667 TIFF, 633
Laws' energy measures, 656 tilt, 515
morphology, 659 tolerance interval, 242
optical transform, 650 top surface (morphological), 570
peak and valley, 659 top-down approach, 319, 321, 365, 366,
primitive grouping, 664 538
primitive length, 655 topographic characterization, 543
run length, 655 topographic primal sketch, 543
shape rules, 661 topology, 31
statistical, 648–660, 664, 666 training set, 302–303, 304, 306, 307,
syntactic, 648, 660–666 310, 311, 322, 323, 380, 391,
texture properties, 654 395
transform hat, 67
binary, 624 homotopic, 576
cosine, 605, 624, 632 increasing, 564, 566
distance, 269 inverse, 568
Fourier, 12–20, 30, 68, 79, 105–107, morphological, see morphological
192, 229, 240, 241, 423, transformation, 561
600, 602, 604, 606, 608–611, pixel brightness, 58–61
613, 624, 650 projective, 449
inverse, 14 top hat, 574
Gabor, 660 translation (morphological), 561
Haar, 608 translational movement, 510, 513, 693,
Hadamard, 604–605, 608, 624, 632 694
Hadamard-Haar, 608 transmission
hat, 85 progressive, 630, 631
Hough, 162, 163–173, 248, 516, smart, 631, 632
608 tree
image, 600–613 interval, 245
Karhunen-Loeve, 608, 623, 624 tree pruning, 154, 320
linear discrete, 600–613 trilinear constraint, 475
orthogonal, 601, 605 two dimensions, see 2D
Paley, 608 two-dimensional, see 2D
Radon, 608 ultimate erosion, 582
recursive block coding, 624 ultrasound, 734
sine, 608 intra-vascular, 734
Slant-Haar, 608 umbra (morphological), 570
Walsh, 608, 624 unit ball, 577
Walsh-Hadamard, 605 unsharp masking, 79
wavelet, 660 unsupervised learning, 307
wavelets, 606, 611 upper semi-continuity, 563
transformation
affine, 64 vanishing points, 516
anti-extensive, 566 velocity
bilinear, 64 field, 681
brightness correction, 58 computation, 698
geodesic, 585 smoothness constraint, 686, 689,
geometric, 62–68 690
brightness interpolation, 65 vector, 680
change of scale, 64 vergence, 517
pixel co-ordinate, 63 shape from, see shape from vergence
rotation, 64
skewing, 65 vertex, 521
gray-scale, 59–61 view
histogram equalization, 60–61 reference, 546
logarithmic, 61 topographic (morphology), 570
pseudo-color, 61 virtual, 546
thresholding, 59 viewing space, 544
viewing sphere, 544
viewpoint, 11
vignetting, 490
virtual view, 546
visibility, 540
visibility classes, 549
visibility locus, 538–540
vision
active, 512
model-based, 535
stereo, 458, 509
view-based, 544
VISIONS, 372, 374
visual
potential, 545
visual system
human, 33, 445, 457, 510, 512, 631,
660
Viterbi algorithm, 420, 423
volume
partial, 131
volumetric model, 523
volumetric primitives, 446
voxel, 130, 523
watersheds, 592
wavelength, 23
wavelets, 606–607, 611, 660
Wiener filtering, 106
WINSOM, 525
X
shape from, see shape from X
YIQ, 25
Zadeh operators, 339
zero crossing, 445
zero-crossing, 80, 83, 90, 244, 245, 445,
480, 605, 654, 690
Zuniga–Haralick operator, 97, 696