Handbook of Computer Vision Algorithms in Image Algebra
by Gerhard X. Ritter; Joseph N. Wilson
CRC Press, CRC Press LLC
ISBN: 0849326362 Pub Date: 05/01/96
Preface
The aim of this book is to acquaint engineers, scientists, and students with the basic concepts of image algebra
and its use in the concise representation of computer vision algorithms. In order to achieve this goal we
provide a brief survey of commonly used computer vision algorithms that we believe represents a core of
knowledge that all computer vision practitioners should have. This survey is not meant to be an encyclopedic
summary of computer vision techniques as it is impossible to do justice to the scope and depth of the rapidly
expanding field of computer vision.
The arrangement of the book is such that it can serve as a reference for computer vision algorithm developers
in general as well as for algorithm developers using the image algebra C++ object library, iac++.1 The
techniques and algorithms presented in a given chapter follow a progression of increasing abstractness. Each
technique is introduced by way of a brief discussion of its purpose and methodology. Since the intent of this
text is to train the practitioner in formulating his algorithms and ideas in the succinct mathematical language
provided by image algebra, an effort has been made to provide the precise mathematical formulation of each
methodology. Thus, we suspect that practicing engineers and scientists will find this presentation somewhat
more practical and perhaps a bit less esoteric than those found in research publications or various textbooks
paraphrasing these publications.
1 The iac++ library supports the use of image algebra in the C++ programming language and is available for
anonymous ftp from ftp://ftp.cis.ufl.edu/pub/src/ia/.
Chapter 1 provides a short introduction to the field of image algebra. Chapters 2-11 are devoted to particular
techniques commonly used in computer vision algorithm development, ranging from early processing
techniques to such higher level topics as image descriptors and artificial neural networks. Although the
chapters on techniques are most naturally studied in succession, they are not tightly interdependent and can be
studied according to the reader’s particular interest. In the Appendix we present iac++ computer programs
of some of the techniques surveyed in this book. These programs reflect the image algebra pseudocode
presented in the chapters and serve as examples of how image algebra pseudocode can be converted into
efficient computer programs.
Acknowledgments
We wish to take this opportunity to express our thanks to our current and former students who have, in
various ways, assisted in the preparation of this text. In particular, we wish to extend our appreciation to Dr.
Paul Gader, Dr. Jennifer Davidson, Dr. Hongchi Shi, Ms. Brigitte Pracht, Mr. Mark Schmalz, Mr. Venugopal
Subramaniam, Mr. Mike Rowlee, Dr. Dong Li, Dr. Huixia Zhu, Ms. Chuanxue Wang, Mr. Jaime Zapata, and
Mr. Liang-Ming Chen. We are most deeply indebted to Dr. David Patching who assisted in the preparation of
the text and contributed to the material by developing examples that enhanced the algorithmic exposition.
Special thanks are due to Mr. Ralph Jackson, who skillfully implemented many of the algorithms herein, and
to Mr. Robert Forsman, the primary implementor of the iac++ library.
We also want to express our gratitude to the Air Force Wright Laboratory for their encouragement and
continuous support of image algebra research and development. This book would not have been written
without the vision and support provided by numerous scientists of the Wright Laboratory at Eglin Air Force
Base in Florida. These supporters include Dr. Lawrence Ankeney who started it all, Dr. Sam Lambert who
championed the image algebra project since its inception, Mr. Nell Urquhart, our first program manager, Ms.
Karen Norris, and most especially Dr. Patrick Coffield who persuaded us to turn a technical report on
computer vision algorithms in image algebra into this book.
Last but not least we would like to thank Dr. Robert Lyjack of ERIM for his friendship and enthusiastic
support during the formative stages of image algebra.
Notation
The tables presented here provide a brief explanation of the notation used throughout this document. The
reader is referred to Ritter [1] for a comprehensive treatise covering the mathematics of image algebra.
Logic
Symbol Explanation
p ⇒ q “p implies q.” If p is true, then q is true.
p ⇔ q “p if and only if q,” which means that p and q are logically equivalent.
iff “if and only if”
¬ “not”
∃ “there exists”
∄ “there does not exist”
∀ “for each”
s.t. “such that”
Sets
Symbol Explanation
X ∪ Y Union: X ∪ Y = {z : z ∈ X or z ∈ Y}
X1 ∪ X2 ∪ ... ∪ Xn = {x : x ∈ Xi for some i}
X ∩ Y Intersection: X ∩ Y = {z : z ∈ X and z ∈ Y}
X1 ∩ X2 ∩ ... ∩ Xn = {x : x ∈ Xi for all i}
X × Y Cartesian product: X × Y = {(x, y) : x ∈ X, y ∈ Y}
X1 × X2 × ... × Xn = {(x1, x2, ..., xn) : xi ∈ Xi}
X1 × X2 × X3 × ... = {(x1, x2, x3, ...) : xi ∈ Xi}
Morphology
In the following table A, B, D, and E denote subsets of the same point set.
Symbol Explanation
A* The reflection of A across the origin 0 = (0, 0, ..., 0).
A′ The complement of A; i.e., A′ = {x : x ∉ A}.
Ab The translate of A by b; Ab = {a + b : a ∈ A}.
A ⊕ B Minkowski addition is defined as A ⊕ B = {a + b : a ∈ A, b ∈ B}. (Section 7.2)
A ⊖ B Minkowski subtraction is defined as A ⊖ B = (A′ ⊕ B*)′. (Section 7.2)
A ○ B The opening of A by B is denoted A ○ B and is defined by A ○ B = (A ⊖ B) ⊕ B. (Section 7.3)
A ● B The closing of A by B is denoted A ● B and is defined by A ● B = (A ⊕ B) ⊖ B. (Section 7.3)
hit-and-miss transform Let C = (D, E) be an ordered pair of structuring elements. The hit-and-miss transform of the set A by C is given by {p : Dp ⊆ A and Ep ⊆ A′}. (Section 7.5)
pj The projection function pj onto the jth coordinate is defined by pj(x1, ..., xj, ..., xn) = xj.
χS(x) The characteristic function χS is defined by χS(x) = 1 if x ∈ S and χS(x) = 0 otherwise.
(a|b), (a1|a2| ··· |an) Row concatenation of images a and b; respectively, the row concatenation of images a1, a2, ..., an.
Images and Image Operations
Symbol Explanation
f(a) If a ∈ 𝔽^X and f : 𝔽 → Y, then the image f(a) ∈ Y^X is given by f ∘ a, i.e., f(a) = {(x, c(x)) : c(x) = f(a(x)), x ∈ X}.
a ∘ f If f : Y → X and a ∈ 𝔽^X, the induced image a ∘ f ∈ 𝔽^Y is defined by a ∘ f = {(y, a(f(y))) : y ∈ Y}.
a γ b If γ is a binary operation on 𝔽, then an induced operation on 𝔽^X can be defined. Let a, b ∈ 𝔽^X; the induced operation is given by a γ b = {(x, c(x)) : c(x) = a(x) γ b(x), x ∈ X}.
k γ a Let k ∈ 𝔽, a ∈ 𝔽^X, and γ be a binary operation on 𝔽. An induced scalar operation on images is defined by k γ a = {(x, c(x)) : c(x) = k γ a(x), x ∈ X}.
a^b Let a, b ∈ ℝ^X; a^b = {(x, c(x)) : c(x) = a(x)^b(x), x ∈ X}.
logb a Let a, b ∈ ℝ^X; logb a = {(x, c(x)) : c(x) = logb(x) a(x), x ∈ X}.
a* Pointwise complex conjugate of image a, a*(x) = (a(x))*.
Γa Γa denotes reduction of a by a generic reduce operation γ: Γa = a(x1) γ a(x2) γ ··· γ a(xn).
The following four items are specific examples of the global reduce operation. Each assumes a ∈ ℝ^X and X = {x1, x2, ..., xn}.
Σa = a(x1) + a(x2) + ··· + a(xn)
Πa = a(x1) · a(x2) ··· a(xn)
∨a = a(x1) ∨ a(x2) ∨ ··· ∨ a(xn)
∧a = a(x1) ∧ a(x2) ∧ ··· ∧ a(xn)
a • b Dot product, a • b = Σ(a · b) = Σx∈X a(x) · b(x).
ã Complementation of a set-valued image a.
ac Complementation of a Boolean image a.
a′ Transpose of image a.
Templates
S(ty) If t is a real- or complex-valued template from Y to X, then S(ty) = {x ∈ X : ty(x) ≠ 0}.
S∞(ty) For an extended real-valued template t, S∞(ty) = {x ∈ X : ty(x) ≠ ∞}.
S-∞(ty) For an extended real-valued template t, S-∞(ty) = {x ∈ X : ty(x) ≠ -∞}.
S±∞(ty) For an extended real-valued template t, S±∞(ty) = {x ∈ X : ty(x) ≠ ±∞}.
t(p) A parameterized 𝔽-valued template from Y to X with parameters in P is a function of the form t : P → (𝔽^X)^Y.
t′ Let t ∈ (𝔽^X)^Y. The transpose t′ ∈ (𝔽^Y)^X is defined by t′x(y) = ty(x).
Image-Template Operations
In the table below, X is a finite point set; ℝ-∞ = ℝ ∪ {-∞} and ℝ+∞ = ℝ ∪ {+∞}.
Operation Explanation
right generic product Let (𝔽, γ, ○) be a semiring, a ∈ 𝔽^X, and t ∈ (𝔽^X)^Y; the generic right product of a with t is the image b ∈ 𝔽^Y defined by b(y) = Γx∈X [a(x) ○ ty(x)].
left generic product With the conditions above, the generic left product of a with t is defined analogously.
right linear product Let a ∈ ℝ^X and t ∈ (ℝ^X)^Y. The right linear product (or convolution) a • t is defined by (a • t)(y) = Σx∈X a(x) · ty(x).
left linear product With the conditions above, the left linear product (or convolution) is defined analogously.
right additive maximum For a ∈ (ℝ-∞)^X and t ∈ ((ℝ-∞)^X)^Y, the right additive maximum is defined by b(y) = ∨x∈X [a(x) + ty(x)].
left additive maximum Defined analogously.
right additive minimum For a ∈ (ℝ+∞)^X and t ∈ ((ℝ+∞)^X)^Y, the right additive minimum is defined by b(y) = ∧x∈X [a(x) +′ ty(x)], where +′ denotes the dual addition.
left additive minimum Defined analogously.
right multiplicative maximum For a non-negative extended real-valued image a and template t, the right multiplicative maximum is defined by b(y) = ∨x∈X [a(x) · ty(x)].
left multiplicative maximum Defined analogously.
right multiplicative minimum For a and t as above, the right multiplicative minimum is defined by b(y) = ∧x∈X [a(x) ×′ ty(x)], where ×′ denotes the dual multiplication.
left multiplicative minimum Defined analogously.
Image-Neighborhood Operations
In the table below, X is a finite point set.
Operation Explanation
generic right reduction Given a ∈ 𝔽^X, a neighborhood function N, and a reduce operation Γ, the generic right reduction of a with N is defined by (a N)(x) = Γ(a|N(x)), the reduction of a restricted to N(x).
generic left reduction With the conditions above, the generic left reduction of a with N is defined by (N a)(x) = (a N′)(x), where N′ denotes the transpose of N.
neighborhood average Given a ∈ 𝔽^X and the image average function avg, yielding the average of its image argument, (a N)(x) = avg(a|N(x)).
neighborhood median Given a ∈ 𝔽^X and the image median function m, yielding the median of its image argument, (a N)(x) = m(a|N(x)).
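As a concrete illustration of the neighborhood reduction (a N)(x) = avg(a|N(x)), the following standalone C++ sketch (not iac++; the row-major array layout is an assumption made only for this example) averages an image over the Moore neighborhood of each pixel, clipping the neighborhood at the image border.

```cpp
#include <cstdio>
#include <vector>

// (a N)(x): average of the image a over the Moore neighborhood of x,
// clipped to the rows x cols grid; a is stored row-major.
std::vector<double> neighborhoodAverage(const std::vector<double>& a, int rows, int cols) {
    std::vector<double> out(a.size(), 0.0);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            double sum = 0.0;
            int count = 0;
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc) {
                    int rr = r + dr, cc = c + dc;
                    if (rr >= 0 && rr < rows && cc >= 0 && cc < cols) {
                        sum += a[rr * cols + cc];
                        ++count;
                    }
                }
            out[r * cols + c] = sum / count;
        }
    return out;
}

int main() {
    std::vector<double> a = {0, 0, 0, 0, 9, 0, 0, 0, 0};   // 3x3 image with one bright pixel
    std::vector<double> b = neighborhoodAverage(a, 3, 3);
    std::printf("b(1,1) = %g, b(0,0) = %g\n", b[4], b[0]);  // 1 and 2.25
    return 0;
}
```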
References
1 G. Ritter, “Image algebra with applications.” Unpublished manuscript, available via anonymous ftp
from ftp://ftp.cis.ufl.edu/pub/src/ia/documents, 1994.
Dedication
To our brothers, Friedrich Karl and Scott Winfield
Chapter 1
Image Algebra
1.1. Introduction
Since the field of image algebra is a recent development, it will be instructive to provide some background
information. In the broad sense, image algebra is a mathematical theory concerned with the transformation
and analysis of images. Although much emphasis is focused on the analysis and transformation of digital
images, the main goal is the establishment of a comprehensive and unifying theory of image transformations,
image analysis, and image understanding in the discrete as well as the continuous domain [1].
The idea of establishing a unifying theory for the various concepts and operations encountered in image and
signal processing is not new. Over thirty years ago, Unger proposed that many algorithms for image
processing and image analysis could be implemented in parallel using cellular array computers [2]. These
cellular array computers were inspired by the work of von Neumann in the 1950s [3, 4]. Realization of von
Neumann’s cellular array machines was made possible with the advent of VLSI technology. NASA’s
massively parallel processor or MPP and the CLIP series of computers developed by Duff and his colleagues
represent the classic embodiment of von Neumann’s original automaton [5, 6, 7, 8, 9]. A more general class
of cellular array computers consists of pyramids and Thinking Machines Corporation’s Connection Machines [10, 11,
12]. In an abstract sense, the various versions of Connection Machines are universal cellular automata with
an additional mechanism added for non-local communication.
Many operations performed by these cellular array machines can be expressed in terms of simple elementary
operations. These elementary operations create a mathematical basis for the theoretical formalism capable of
expressing a large number of algorithms for image processing and analysis. In fact, a common thread among
designers of parallel image processing architectures is the belief that large classes of image transformations
can be described by a small set of standard rules that induce these architectures. This belief led to the creation
of mathematical formalisms that were used to aid in the design of special-purpose parallel architectures.
Matheron and Serra’s Texture Analyzer [13], ERIM’s (Environmental Research Institute of Michigan)
Cytocomputer [14, 15, 16], and Martin Marietta’s GAPP [17, 18, 19] are examples of this approach.
The formalism associated with these cellular architectures is that of pixel neighborhood arithmetic and
mathematical morphology. Mathematical morphology is the part of image processing concerned with image
filtering and analysis by structuring elements. It grew out of the early work of Minkowski and Hadwiger [20,
21, 22], and entered the modern era through the work of Matheron and Serra of the Ecole des Mines in
Fontainebleau, France [23, 24, 25, 26]. Matheron and Serra not only formulated the modern concepts of
morphological image transformations, but also designed and built the Texture Analyzer System. Since those
early days, morphological operations have been applied from low-level, to intermediate, to high-level vision
problems. Among some recent research papers on morphological image processing are Crimmins and Brown
[27], Haralick et al. [28, 29], Maragos and Schafer [30, 31, 32], Davidson [33, 34], Dougherty [35], Goutsias
[36, 37], and Koskinen and Astola [38].
Serra and Sternberg were the first to unify morphological concepts and methods into a coherent algebraic
theory specifically designed for image processing and image analysis. Sternberg was also the first to use the
term “image algebra” [39, 40]. In the mid 1980s, Maragos introduced a new theory unifying a large class of
linear and nonlinear systems under the theory of mathematical morphology [41]. More recently, Davidson
completed the mathematical foundation of mathematical morphology by formulating its embedding into the
lattice algebra known as Mini-Max algebra [42, 43]. However, despite these profound accomplishments,
morphological methods have some well-known limitations. For example, such fairly common image
processing techniques as feature extraction based on convolution, Fourier-like transformations, chain coding,
histogram equalization transforms, image rotation, and image registration and rectification are — with the
exception of a few simple cases — either extremely difficult or impossible to express in terms of
morphological operations. The failure of a morphologically based image algebra to express a fairly
straightforward U.S. government-furnished FLIR (forward-looking infrared) algorithm was demonstrated by
Miller of Perkin-Elmer [44].
The failure of an image algebra based solely on morphological operations to provide a universal image
processing algebra is due to its set-theoretic formulation, which rests on the Minkowski addition and
subtraction of sets [22]. These operations ignore the linear domain, transformations between different
domains (spaces of different sizes and dimensionality), and transformations between different value sets
(algebraic structures), e.g., sets consisting of real, complex, or vector valued numbers. The image algebra
discussed in this text includes these concepts and extends the morphological operations [1].
The development of image algebra grew out of a need, by the U.S. Air Force Systems Command, for a
common image-processing language. Defense contractors do not use a standardized, mathematically rigorous
and efficient structure that is specifically designed for image manipulation. Documentation by contractors of
algorithms for image processing and rationale underlying algorithm design is often accomplished via word
description or analogies that are extremely cumbersome and often ambiguous. The result of these ad hoc
approaches has been a proliferation of nonstandard notation and increased research and development cost. In
response to this chaotic situation, the Air Force Armament Laboratory (AFATL — now known as Wright
Laboratory MNGA) of the Air Force Systems Command, in conjunction with the Defense Advanced
Research Project Agency (DARPA — now known as the Advanced Research Project Agency or ARPA),
supported the early development of image algebra with the intent that the fully developed structure would
subsequently form the basis of a common image-processing language. The goal of AFATL was the
development of a complete, unified algebraic structure that provides a common mathematical environment for
image-processing algorithm development, optimization, comparison, coding, and performance evaluation.
The development of this structure proved highly successful, capable of fulfilling the tasks set forth by the
government, and is now commonly known as image algebra.
Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc. All rights
reserved. Reproduction whole or in part in any form or medium without express written permission of
EarthWeb is prohibited. Read EarthWeb's privacy statement.
Handbook of Computer Vision Algorithms in Image Algebra
by Gerhard X. Ritter; Joseph N. Wilson
CRC Press, CRC Press LLC
ISBN: 0849326362 Pub Date: 05/01/96
Search Tips
Search this book:
Advanced Search
Because of the goals set by the government, the theory of image algebra provides for a language which, if
properly implemented as a standard image processing environment, can greatly reduce research and
development costs. Since the foundation of this language is purely mathematical and independent of any
future computer architecture or language, the longevity of an image algebra standard is assured. Furthermore,
savings due to commonality of language and increased productivity could dwarf any reasonable initial
investment for adapting image algebra as a standard environment for image processing.
Although commonality of language and cost savings are two major reasons for considering image algebra as a
standard language for image processing, there exists a multitude of other reasons for desiring the broad
acceptance of image algebra as a component of all image processing development systems. Premier among
these is the predictable influence of an image algebra standard on future image processing technology. In this,
it can be compared to the influence on scientific reasoning and the advancement of science due to the
replacement of the myriad of different number systems (e.g., Roman, Syrian, Hebrew, Egyptian, Chinese,
etc.) by the now common Indo-Arabic notation. Additional benefits provided by the use of image algebra are
• The elemental image algebra operations are small in number, translucent, simple, and provide a
method of transforming images that is easily learned and used;
• Image algebra operations and operands provide the capability of expressing all image-to-image
transformations;
• Theorems governing image algebra make computer programs based on image algebra notation
amenable to both machine dependent and machine independent optimization techniques;
• The algebraic notation provides a deeper understanding of image manipulation operations due to
conciseness and brevity of code and is capable of suggesting new techniques;
• The notational adaptability to programming languages allows the substitution of extremely short and
concise image algebra expressions for equivalent blocks of code, and therefore increases programmer
productivity;
• Image algebra provides a rich mathematical structure that can be exploited to relate image processing
problems to other mathematical areas;
• Without image algebra, a programmer will never benefit from the bridge that exists between an image
algebra programming language and the multitude of mathematical structures, theorems, and identities
that are related to image algebra;
• There is no competing notation that adequately provides all these benefits.
The role of image algebra in computer vision and image processing tasks and theory should not be confused
with the government’s Ada programming language effort. The goal of the development of the Ada
programming language was to provide a single high-order language in which to implement embedded
systems. The special architectures being developed nowadays for image processing applications are not often
capable of directly executing Ada language programs, often due to support of parallel processing models not
accommodated by Ada’s tasking mechanism. Hence, most applications designed for such processors are still
written in special assembly or microcode languages. Image algebra, on the other hand, provides a level of
specification, directly derived from the underlying mathematics on which image processing is based and that
is compatible with both sequential and parallel architectures.
Enthusiasm for image algebra must be tempered by the knowledge that image algebra, like any other field of
mathematics, will never be a finished product but remain a continuously evolving mathematical theory
concerned with the unification of image processing and computer vision tasks. Much of the mathematics
associated with image algebra and its implications for computer vision remains largely uncharted territory
which awaits discovery. For example, very little work has been done in relating image algebra to computer
vision techniques which employ tools from such diverse areas as knowledge representation, graph theory, and
surface representation.
Several image algebra programming languages have been developed. These include image algebra Fortran
(IAF) [45], an image algebra Ada (IAA) translator [46], image algebra Connection Machine *Lisp [47, 48], an
image algebra language (IAL) implementation on transputers [49, 50], and an image algebra C++ class library
(iac++) [51, 52]. Unfortunately, there is often a tendency among engineers to confuse or equate these
languages with image algebra. An image algebra programming language is not image algebra, which is a
mathematical theory. An image algebra-based programming language typically implements a particular
subalgebra of the full image algebra. In addition, simplistic implementations can result in poor computational
performance. Restrictions and limitations in implementation are usually due to a combination of factors, the
most pertinent being development costs and hardware and software environment constraints. They are not
limitations of image algebra, and they should not be confused with the capability of image algebra as a
mathematical tool for image manipulation.
Image algebra is a heterogeneous or many-valued algebra in the sense of Birkhoff and Lipson [53, 1], with
multiple sets of operands and operators. Manipulation of images for purposes of image enhancement,
analysis, and understanding involves operations not only on images, but also on different types of values and
quantities associated with these images. Thus, the basic operands of image algebra are images and the values
and quantities associated with these images. Roughly speaking, an image consists of two things, a collection
of points and a set of values associated with these points. Images are therefore endowed with two types of
information, namely the spatial relationship of the points, and also some type of numeric or other descriptive
information associated with these points. Consequently, the field of image algebra bridges two broad
mathematical areas, the theory of point sets and the algebra of value sets, and investigates their
interrelationship. In the sections that follow we discuss point and value sets as well as images, templates, and
neighborhoods that characterize some of their interrelationships.
There is no restriction on the shape of the discrete subsets of ℝⁿ used in applications of image algebra to
solve vision problems. Point sets can assume arbitrary shapes. In particular, shapes can be rectangular,
circular, or snake-like. Some of the more pertinent point sets are the set of integer points ℤ (here we view
ℤ ⊂ ℝ), the n-dimensional lattice ℤⁿ with n = 2 or n = 3, and rectangular subsets of ℤ². Two of the most
often encountered rectangular point sets are rectangular m × n arrays of points of ℤ².
We follow standard practice and represent these rectangular point sets by listing the points in matrix form.
Figure 1.2.1 provides a graphical representation of such a point set.
Figure 1.2.1 A rectangular point set.
Point Operations
As mentioned, some of the more pertinent point sets are discrete subsets of the vector space ℝⁿ. These point
sets inherit the usual elementary vector space operations. Thus, for example, if x = (x1, ..., xn) and
y = (y1, ..., yn) are points of X ⊆ ℝⁿ (or ℤⁿ), then the sum of the points x and y is defined as
x + y = (x1 + y1, ..., xn + yn),
while the multiplication and addition of a scalar k ∈ ℝ (or ℤ) and a point x are given by
k · x = (k · x1, ..., k · xn)
and
k + x = (k + x1, ..., k + xn),
and the dot product of x and y is given by
x · y = x1 · y1 + x2 · y2 + ... + xn · yn.
Note that the sum of two points, the Hadamard product, and the cross product are binary operations that take
as input two points and produce another point. Therefore these operations can be viewed as mappings X × X →
X whenever X is closed under these operations. In contrast, the dot product of two points is a scalar and
not another vector. This provides an example of a mapping X × X → 𝔽, where 𝔽 denotes the
appropriate field of scalars. Another such mapping, associated with metric spaces, is the distance function
d : X × X → ℝ which assigns to each pair of points x and y the distance from x to y. The most common
distance functions occurring in image processing are the Euclidean distance, the city block or diamond
distance, and the chessboard distance, which are defined by
d(x, y) = ((x1 - y1)² + ... + (xn - yn)²)^(1/2),
ρ(x, y) = |x1 - y1| + ... + |xn - yn|,
and
δ(x, y) = max{|xk - yk| : 1 ≤ k ≤ n},
respectively.
Distances can be conveniently computed in terms of the norm of a point. The three norms of interest here are
derived from the standard Lp norms ||x||p = (|x1|^p + ... + |xn|^p)^(1/p), together with the limiting case
||x||∞ = max{|xk| : 1 ≤ k ≤ n}. Thus, d(x, y) = ||x - y||2. Similarly, the city block distance can be computed
using the formulation ρ(x, y) = ||x - y||1 and the chessboard distance by using δ(x, y) = ||x - y||∞.
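A minimal standalone C++ sketch of the three distance functions follows; it assumes nothing beyond the formulas above and represents points simply as coordinate vectors.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// d(x,y) = ||x - y||_2, the Euclidean distance.
double euclidean(const std::vector<double>& x, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) s += (x[k] - y[k]) * (x[k] - y[k]);
    return std::sqrt(s);
}
// rho(x,y) = ||x - y||_1, the city block (diamond) distance.
double cityBlock(const std::vector<double>& x, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) s += std::fabs(x[k] - y[k]);
    return s;
}
// delta(x,y) = ||x - y||_inf, the chessboard distance.
double chessboard(const std::vector<double>& x, const std::vector<double>& y) {
    double m = 0.0;
    for (std::size_t k = 0; k < x.size(); ++k) m = std::max(m, std::fabs(x[k] - y[k]));
    return m;
}

int main() {
    std::vector<double> x = {0, 0}, y = {3, 4};
    std::printf("d = %g, rho = %g, delta = %g\n",
                euclidean(x, y), cityBlock(x, y), chessboard(x, y));  // 5, 7, 4
    return 0;
}
```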
Note that the p-norm of a point x is a unary operation, namely a function || · ||p : ℝⁿ → ℝ. Another
assemblage of functions which play a major role in various applications are the projection functions. Given
X ⊆ ℝⁿ, then the ith projection on X, where i ∈ {1, ..., n}, is denoted by pi and defined by
pi(x) = xi, where xi denotes the ith coordinate of x.
Characteristic functions and neighborhood functions are two of the most frequently occurring unary
operations in image processing. In order to define these operations, we need to recall the notion of the power set
of a set. The power set of a set S is defined as the set of all subsets of S and is denoted by 2^S. Thus, if Z is a
point set, then 2^Z = {X : X ⊆ Z}.
Given X ∈ 2^Z (i.e., X ⊆ Z), then the characteristic function associated with X is the function
χX : Z → {0, 1}
defined by χX(z) = 1 if z ∈ X and χX(z) = 0 if z ∉ X.
For a pair of point sets X and Z, a neighborhood system for X in Z, or equivalently, a neighborhood function
from X to Z, is a function
N : X → 2^Z.
It follows that for each point x ∈ X, N(x) ⊆ Z. The set N(x) is called a neighborhood for x.
There are two neighborhood functions on subsets of ℤ² which are of particular importance in image
processing. These are the von Neumann neighborhood and the Moore neighborhood. The von Neumann
neighborhood is defined by
N(x) = {y : y = (x1 ± j, x2) or y = (x1, x2 ± k), j, k ∈ {0, 1}},
where x = (x1, x2), while the Moore neighborhood is defined by
M(x) = {y : y = (x1 ± j, x2 ± k), j, k ∈ {0, 1}}.
Figure 1.2.2 provides a pictorial representation of these two neighborhood functions; the hashed center area
represents the point x and the adjacent cells represent the adjacent points. The von Neumann and Moore
neighborhoods are also called the four neighborhood and eight neighborhood, respectively. They are local
neighborhoods since they only include the directly adjacent points of a given point.
Figure 1.2.2 The von Neumann neighborhood N(x) and the Moore neighborhood M(x) of a point x.
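The following standalone C++ sketch (the Point representation is an illustrative choice, not iac++ notation) enumerates the von Neumann and Moore neighborhoods of a point of ℤ² exactly as defined above; note that both contain the point itself.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;

// Von Neumann neighborhood N(x): x itself and its four edge-adjacent neighbors.
std::vector<Point> vonNeumann(const Point& x) {
    return {x, {x.first - 1, x.second}, {x.first + 1, x.second},
               {x.first, x.second - 1}, {x.first, x.second + 1}};
}

// Moore neighborhood M(x): x itself and its eight surrounding neighbors.
std::vector<Point> moore(const Point& x) {
    std::vector<Point> out;
    for (int dj = -1; dj <= 1; ++dj)
        for (int dk = -1; dk <= 1; ++dk)
            out.push_back({x.first + dj, x.second + dk});
    return out;
}

int main() {
    Point x{5, 7};
    std::printf("|N(x)| = %zu, |M(x)| = %zu\n", vonNeumann(x).size(), moore(x).size());  // 5 and 9
    return 0;
}
```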
There are many other point operations that are useful in expressing computer vision algorithms in succinct
algebraic form. For instance, in certain interpolation schemes it becomes necessary to switch from points with
real-valued coordinates (floating point coordinates) to corresponding integer-valued coordinate points. One
such method uses the induced floor operation ⌊·⌋ : ℝⁿ → ℤⁿ defined by ⌊x⌋ = (⌊x1⌋, ⌊x2⌋, ..., ⌊xn⌋), where
⌊xi⌋ denotes the largest integer less than or equal to xi (i.e., ⌊xi⌋ ≤ xi and if k ∈ ℤ with k ≤ xi, then
k ≤ ⌊xi⌋).
It is important to note that several of the above unary operations are special instances of spatial
transformations X → Y. Spatial transforms play a vital role in many image processing and computer vision
tasks.
In the above summary we only considered points with real- or integer-valued coordinates. Points of other
spaces have their own induced operations. For example, typical operations on points of (i.e.,
Boolean-valued points) are the usual logical operations of AND, OR, XOR, and complementation.
Another set of operations on 2^Z are the usual set operations of union, intersection, set difference (or relative
complement), symmetric difference, and Cartesian product as defined below.
union X ∪ Y = {z : z ∈ X or z ∈ Y}
intersection X ∩ Y = {z : z ∈ X and z ∈ Y}
set difference X\Y = {z : z ∈ X and z ∉ Y}
symmetric difference XΔY = {z : z ∈ X ∪ Y and z ∉ X ∩ Y}
Cartesian product X × Y = {(x, y) : x ∈ X and y ∈ Y}
Note that with the exception of the Cartesian product, the set obtained for each of the above operations is
again an element of 2^Z.
Another common set theoretic operation is set complementation. For X ∈ 2^Z, the complement of X is denoted
by X′ and defined as X′ = {z ∈ Z : z ∉ X}. In contrast to the binary set operations defined
above, set complementation is a unary operation. However, complementation can be computed in terms of the
binary operation of set difference by observing that X′ = Z\X.
In addition to complementation there are various other common unary operations which play a major role in
algorithm development using image algebra. Among these is the cardinality of a set which, when applied to a
finite point set, yields the number of elements in the set, and the choice function which, when applied to a set,
selects a randomly chosen point from the set. The cardinality of a set X will be denoted by card(X). Note that
card maps 2^Z into the nonnegative integers (extended by ∞ for infinite sets), while
choice : 2^Z → Z.
The interpretation of sup(X) is as follows. Suppose X is finite, say X = {x1, x2, ..., xk}. Then sup(X) =
sup(... sup(sup(sup(x1, x2), x3), x4), ..., xk), where sup(xi, xj) denotes the binary operation of the supremum of two
points defined earlier. Equivalently, if xi = (xi, yi) for i = 1, ..., k, then sup(X) = (x1 ∨ x2 ∨ ... ∨ xk, y1 ∨ y2 ∨ ... ∨ yk).
More generally, sup(X) is defined to be the least upper bound of X (if it exists). The infimum of X is
interpreted in a similar fashion.
If X is finite and has a total order, then we also define the maximum and minimum of X, denoted by ∨X and
∧X, respectively, as follows. Suppose X = {x1, x2, ..., xk} and x1 ≺ x2 ≺ ... ≺ xk, where the
symbol ≺ denotes the particular total order on X. Then ∨X = xk and ∧X = x1. The most commonly
used order for a subset X of ℤ² is the row scanning order. Note also that in contrast to the supremum or
infimum, the maximum and minimum of a (finite, totally ordered) set is always a member of the set.
For the set ℝ±∞ = ℝ ∪ {-∞, +∞} we need to extend the arithmetic and logic operations of ℝ as follows: for every
a ∈ ℝ, a + (-∞) = (-∞) + a = -∞ and a + (+∞) = (+∞) + a = +∞, while (-∞) + (+∞) = (+∞) + (-∞) = -∞, and the
order relation extends so that -∞ ≤ a ≤ +∞.
Note that the element -∞ acts as a null element in the system (ℝ±∞, ∨, +) if we view the operation + as
multiplication and the operation ∨ as addition. The same cannot be said about the element +∞ in the system
since (-∞) + (+∞) = (+∞) + (-∞) = -∞. In order to remedy this situation we define the dual structure
(ℝ±∞, ∧, +′) of (ℝ±∞, ∨, +) as follows: the dual addition +′ agrees with + on ℝ, while
(-∞) +′ (+∞) = (+∞) +′ (-∞) = +∞.
Now the element +∞ acts as a null element in the system (ℝ±∞, ∧, +′). Observe, however, that the dual
additions + and +′ introduce an asymmetry between -∞ and +∞. The resultant structure
(ℝ±∞, ∨, ∧, +, +′) is known as a bounded lattice ordered group [1].
Dual structures provide for the notion of dual elements. For each r ∈ ℝ±∞ we define its dual or conjugate
r* by r* = -r, where -(-∞) = +∞. The following duality laws are a direct consequence of this definition:
(1) (r*)* = r
(2) (r ∧ t)* = r* ∨ t* and (r ∨ t)* = r* ∧ t*.
Closely related to the additive bounded lattice ordered group described above is the multiplicative bounded
lattice ordered group on the non-negative extended reals. Here the dual ×′ of ordinary multiplication agrees with ×
on the positive reals, with the conventions chosen so that the element 0 acts as a null element in the system with
∨ and × and the element +∞ acts as a null element in the system with ∧ and ×′. The conjugate r* of an element of
this value set is defined by r* = 1/r for 0 < r < +∞, 0* = +∞, and (+∞)* = 0.
Another algebraic structure with duality which is of interest in image algebra is the value set
Note that the addition (as well as its dual) restricted to the two-element subset {0, 1} is the exclusive or
operation xor and computes the values for the truth table of the biconditional statement p ⇔ q (i.e., p if and only if q).
The operations on the value set can be easily generalized to its k-fold Cartesian product. Specifically, if
m = (m1, ..., mk) and n = (n1, ..., nk), where mi and ni belong to the value set for i = 1, ..., k, then the
operations are applied componentwise.
The componentwise addition should not be confused with the usual addition mod 2^k on ℤ_{2^k}.
Many point sets are also value sets. For example, the point set ℝⁿ is a metric space as well as a
vector space with the usual operation of vector addition. Thus, (ℝⁿ, +), where the symbol “+” denotes
vector addition, will at various times be used both as a point set and as a value set. Confusion as to usage will
not arise as usage should be clear from the discussion.
In contrast to structure (c), the addition and multiplication in structure (d) are addition and multiplication mod 2^k.
These listed structures represent the pertinent global structures. In various applications only certain
subalgebras of these algebras are used. For example, the subalgebras and of
play special roles in morphological processing. Similarly, the subalgebra
of , where , is the only pertinent applicable
algebra in certain cases.
The complementary binary operations, whenever they exist, are assumed to be part of the structures. Thus, for
example, subtraction and division, which can be defined in terms of addition and multiplication, respectively,
are assumed to be part of the algebra of the real numbers.
For the following operations assume that A, B ∈ 2^𝔽 for some value set 𝔽.
union A ∪ B = {c : c ∈ A or c ∈ B}
intersection A ∩ B = {c : c ∈ A and c ∈ B}
set difference A\B = {c : c ∈ A and c ∉ B}
symmetric difference AΔB = {c : c ∈ A ∪ B and c ∉ A ∩ B}
Cartesian product A × B = {(a, b) : a ∈ A and b ∈ B}
choice function choice(A) ∈ A
cardinality card(A) = cardinality of A
supremum sup(A) = supremum of A
infimum inf(A) = infimum of A
Definition: Let 𝔽 be a value set and X a point set. An 𝔽-valued image on X is any element of
𝔽^X. Given an 𝔽-valued image a ∈ 𝔽^X (i.e., a : X → 𝔽), then 𝔽 is called the set of
possible range values of a and X the spatial domain of a.
It is often convenient to let the graph of an image a ∈ 𝔽^X represent a. The graph of an image is also
referred to as the data structure representation of the image. Given the data structure representation a =
{(x, a(x)) : x ∈ X}, then an element (x, a(x)) of the data structure is called a picture element or pixel. The first
coordinate x of a pixel is called the pixel location or image point, and the second coordinate a(x) is called the
pixel value of a at location x.
The above definition of an image covers all mathematical images on topological spaces with range in an
algebraic system. Requiring X to be a topological space provides us with the notion of nearness of pixels.
Since X is not directly specified we may substitute any space required for the analysis of an image or imposed
by a particular sensor and scene. For example, X could be a subset of ℤ³ or ℝ³ with x ∈ X of the form x = (x, y, t),
where the first coordinates (x, y) denote spatial location and t a time variable.
Let a, b ∈ 𝔽^X. Then
a γ b = {(x, c(x)) : c(x) = a(x) γ b(x), x ∈ X}.
For example, suppose a, b ∈ ℝ^X and our value set is the algebraic structure of the real numbers
(ℝ, +, ·, ∨, ∧). Replacing γ by the binary operations +, ·, ∨, and ∧ we obtain the basic binary operations
a + b = {(x, c(x)) : c(x) = a(x) + b(x), x ∈ X},
a · b = {(x, c(x)) : c(x) = a(x) · b(x), x ∈ X},
a ∨ b = {(x, c(x)) : c(x) = a(x) ∨ b(x), x ∈ X},
and
a ∧ b = {(x, c(x)) : c(x) = a(x) ∧ b(x), x ∈ X}
on real-valued images. Obviously, all four operations are commutative and associative.
In addition to the binary operation between images, the binary operation γ on 𝔽 also induces the following
scalar operations on images:
For k ∈ 𝔽 and a ∈ 𝔽^X,
k γ a = {(x, c(x)) : c(x) = k γ a(x), x ∈ X}
and
a γ k = {(x, c(x)) : c(x) = a(x) γ k, x ∈ X}.
Thus, for 𝔽 = ℝ, we obtain the following scalar multiplication and addition of real-valued images:
k · a = {(x, c(x)) : c(x) = k · a(x), x ∈ X}
and
k + a = {(x, c(x)) : c(x) = k + a(x), x ∈ X}.
It follows from the commutativity of real numbers that
k · a = a · k and k + a = a + k.
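A minimal standalone C++ sketch of these induced operations follows; it stores an image as a flat array of pixel values over an implicit finite domain and passes the binary operation γ as a function object, an implementation choice made only for this illustration (it is not the iac++ interface).

```cpp
#include <algorithm>
#include <cstdio>
#include <functional>
#include <vector>

using Image = std::vector<double>;   // pixel value a(x_i) stored at index i

// a gamma b: apply a binary operation pixel by pixel over a common domain X.
Image pointwise(const Image& a, const Image& b, const std::function<double(double, double)>& gamma) {
    Image c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) c[i] = gamma(a[i], b[i]);
    return c;
}

// k gamma a: the induced scalar operation.
Image scalarOp(double k, const Image& a, const std::function<double(double, double)>& gamma) {
    Image c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) c[i] = gamma(k, a[i]);
    return c;
}

int main() {
    Image a = {1, 2, 3}, b = {4, 1, 5};
    Image sum  = pointwise(a, b, std::plus<double>());                                // a + b
    Image vmax = pointwise(a, b, [](double s, double t) { return std::max(s, t); }); // a v b
    Image twoA = scalarOp(2.0, a, std::multiplies<double>());                         // 2 * a
    std::printf("%g %g %g\n", sum[0], vmax[1], twoA[2]);  // 5 2 6
    return 0;
}
```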
Although much of image processing is accomplished using real-, integer-, binary-, or complex-valued images,
many higher-level vision tasks require manipulation of vector- and set-valued images. A set-valued image is of the
form a : X → 2^𝔽. Here the underlying value set is (2^𝔽, ∪, ∩, ~), where the tilde symbol denotes
complementation. Hence, the operations on set-valued images are those induced by the Boolean algebra of the
value set. For example, if a, b ∈ (2^𝔽)^X, then
a ∪ b = {(x, c(x)) : c(x) = a(x) ∪ b(x), x ∈ X},
a ∩ b = {(x, c(x)) : c(x) = a(x) ∩ b(x), x ∈ X},
and
ã = {(x, c(x)) : c(x) = ~a(x), x ∈ X},
where ~a(x) denotes the complement of the set a(x) in 𝔽.
The operation of complementation is, of course, a unary operation. A particularly useful unary operation on
images which is induced by a binary operation on a value set is known as the global reduce operation. More
precisely, if γ is an associative and commutative binary operation on 𝔽 and X is finite, say X = {x1, x2, ..., xn},
then γ induces a unary operation
Γ : 𝔽^X → 𝔽
defined by Γa = a(x1) γ a(x2) γ ··· γ a(xn).
In all, the value set (ℝ, +, ·, ∨, ∧) provides for four basic global reduce operations, namely
Σa, Πa, ∨a, and ∧a.
A second class of unary image operations arises from composing a function f : 𝔽 → 𝔽 with an image a ∈ 𝔽^X,
yielding the image f(a) = f ∘ a. Note that in this definition we view the composition f ∘ a as a unary operation on 𝔽^X with operand a. This
subtle distinction has the important consequence that f is viewed as a unary operation — namely a function
from 𝔽^X to 𝔽^X — and a as an argument of f. For example, substituting ℝ for 𝔽 and the sine function
for f, we obtain the induced operation sin : ℝ^X → ℝ^X, where
sin(a) = {(x, c(x)) : c(x) = sin(a(x)), x ∈ X}.
As another example, consider the characteristic function
χ≥k(r) = 1 if r ≥ k and χ≥k(r) = 0 if r < k.
Then for any a ∈ ℝ^X, χ≥k(a) is the Boolean (two-valued) image on X with value 1 at location x if a(x) ≥ k
and value 0 if a(x) < k. An obvious application of this operation is the thresholding of an image. Given a
floating point image a and using the characteristic function χ[j,k], the thresholded image b := a · χ[j,k](a)
is given by
b = {(x, b(x)) : b(x) = a(x) if j ≤ a(x) ≤ k, otherwise b(x) = 0}.
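The following standalone C++ sketch illustrates a global reduce operation (Σa) and the thresholding image χ≥k(a) just described; the flat-array image representation is assumed only for this example.

```cpp
#include <cstdio>
#include <numeric>
#include <vector>

using Image = std::vector<double>;

// Sigma a = a(x1) + a(x2) + ... + a(xn), the global reduce induced by +.
double sumReduce(const Image& a) {
    return std::accumulate(a.begin(), a.end(), 0.0);
}

// chi_{>=k}(a): the Boolean image with value 1 where a(x) >= k and 0 elsewhere.
Image chiGE(const Image& a, double k) {
    Image b(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) b[i] = (a[i] >= k) ? 1.0 : 0.0;
    return b;
}

int main() {
    Image a = {0.2, 0.7, 0.5, 0.9};
    Image mask = chiGE(a, 0.5);
    std::printf("Sigma a = %g, mask = %g %g %g %g\n",
                sumReduce(a), mask[0], mask[1], mask[2], mask[3]);  // 2.3, 0 1 1 1
    return 0;
}
```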
The unary operations on an image discussed thus far have resulted either in a scalar (an element of
𝔽) by use of the global reduction operation, or in another 𝔽-valued image by use of the composition
f ∘ a. More generally, given a function f : 𝔽 → 𝕊, the composition f ∘ a provides for a unary
operation which changes an 𝔽-valued image into an 𝕊-valued image f(a). Taking the same viewpoint, but
using a function f between spatial domains instead, provides a scheme for realizing naturally induced
operations for spatial manipulation of image data. In particular, if f : Y → X and a ∈ 𝔽^X, then we define the
induced image a ∘ f ∈ 𝔽^Y by
a ∘ f = {(y, a(f(y))) : y ∈ Y}.
Thus, the operation defined by the above equation transforms an 𝔽-valued image defined over the space X
into an 𝔽-valued image defined over the space Y.
Examples of spatially based image transformations are affine and perspective transforms. For instance, suppose
a ∈ ℝ^X, where X is a rectangular m × n array. If f : X → X maps each point on one side of the line x = k to its
mirror image across that line and leaves the remaining points fixed, then a ∘ f is a one-sided reflection of a across
the line x = k. Further examples are provided by several of the algorithms presented in this text.
Simple shifts of an image can be achieved by using either a spatial transformation or point addition. In
particular, given a ∈ 𝔽^X and a point y, we define a shift of a by y as
a + y = {(z, b(z)) : b(z) = a(z - y), z - y ∈ X}.
Note that a + y is an image on X + y since z - y ∈ X ⇔ z ∈ X + y, which provides for the equivalent formulation
a + y = {(z, b(z)) : b(z) = a(z - y), z ∈ X + y}.
Of course, one could just as well define a spatial transformation f : X + y → X by f(z) = z - y in order to obtain
the identical shifted image a ∘ f = a + y.
Another simple unary image operation that can be defined in terms of a spatial map is image transposition.
Given an image a ∈ 𝔽^X, then the transpose of a, denoted by a′, is defined as a′ = a ∘ f, where f
is given by f(x, y) = (y, x).
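A minimal standalone C++ sketch of these two spatially induced operations follows. It works on a fixed rectangular domain and sets pixels whose preimage falls outside the domain to 0, which is a simplification of the domain-shifting formulation given above; the Grid type is hypothetical, not iac++.

```cpp
#include <cstdio>
#include <vector>

// A rectangular real-valued image stored row-major.
struct Grid {
    int rows, cols;
    std::vector<double> v;
};

// b(z) = a(z - y): a shift by (dy, dx); preimages outside the grid are set to 0 here.
Grid shift(const Grid& a, int dy, int dx) {
    Grid b{a.rows, a.cols, std::vector<double>(a.v.size(), 0.0)};
    for (int r = 0; r < a.rows; ++r)
        for (int c = 0; c < a.cols; ++c) {
            int pr = r - dy, pc = c - dx;
            if (pr >= 0 && pr < a.rows && pc >= 0 && pc < a.cols)
                b.v[r * a.cols + c] = a.v[pr * a.cols + pc];
        }
    return b;
}

// a'(x, y) = a(y, x): the transpose realized through the spatial map f(x, y) = (y, x).
Grid transpose(const Grid& a) {
    Grid b{a.cols, a.rows, std::vector<double>(a.v.size(), 0.0)};
    for (int r = 0; r < a.rows; ++r)
        for (int c = 0; c < a.cols; ++c)
            b.v[c * a.rows + r] = a.v[r * a.cols + c];
    return b;
}

int main() {
    Grid a{2, 3, {1, 2, 3, 4, 5, 6}};
    Grid s = shift(a, 0, 1);      // shift one column to the right
    Grid t = transpose(a);        // a 3x2 image
    std::printf("s(0,1) = %g, t(2,1) = %g\n", s.v[1], t.v[2 * t.cols + 1]);  // 1 and 6
    return 0;
}
```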
where a is a non-negative real-valued image on X. We may extend this operation to a binary image operation
as follows: if a, b ∈ ℝ^X, then
a^b = {(x, c(x)) : c(x) = a(x)^b(x), x ∈ X}.
The notion of exponentiation can be extended to negative-valued images as long as we follow the rules of
arithmetic and restrict this binary operation to those pairs of real-valued images for which the pointwise powers
a(x)^b(x) are defined. This avoids creation of complex, undefined, and indeterminate pixel values such as
(-1)^(1/2), 0^(-1), and 0^0, respectively. However, there is one exception to these rules of standard arithmetic. The
algebra of images provides for the existence of pseudo inverses. For a ∈ ℝ^X, the pseudo inverse of a, which
for reasons of simplicity is denoted by a⁻¹, is defined as
a⁻¹ = {(x, c(x)) : c(x) = 1/a(x) if a(x) ≠ 0, and c(x) = 0 if a(x) = 0}.
Note that if some pixel values of a are zero, then a · a⁻¹ ≠ 1, where 1 denotes the unit image all of whose pixel
values are 1. However, the equality a · a⁻¹ · a = a always holds. Hence the name “pseudo inverse.”
The inverse of exponentiation is defined in the usual way by taking logarithms. Specifically,
logb a = {(x, c(x)) : c(x) = logb(x) a(x), x ∈ X}.
As for real numbers, logb a is defined only for positive images; i.e., a, b ∈ (ℝ⁺)^X.
Another set of examples of binary operations induced by unary operations are the characteristic functions for
comparing two images. For a, b ∈ ℝ^X we define
χ≤b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≤ b(x), otherwise c(x) = 0}
χ<b(a) = {(x, c(x)) : c(x) = 1 if a(x) < b(x), otherwise c(x) = 0}
χ=b(a) = {(x, c(x)) : c(x) = 1 if a(x) = b(x), otherwise c(x) = 0}
χ≥b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≥ b(x), otherwise c(x) = 0}
χ>b(a) = {(x, c(x)) : c(x) = 1 if a(x) > b(x), otherwise c(x) = 0}
χ≠b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≠ b(x), otherwise c(x) = 0}.
If a ∈ 𝔽^X and Z ⊆ X, the restriction of a to Z is denoted by a|Z and defined by a|Z = {(x, a(x)) : x ∈ Z};
thus, a|Z ∈ 𝔽^Z. In practice, the user may specify Z explicitly by providing bounds for the coordinates of the
points of Z.
There is nothing magical about restricting a to a subset Z of its domain X. We can just as well define
restrictions of images to subsets of the range values. Specifically, if a ∈ 𝔽^X and S ⊆ 𝔽, then the
restriction of a to S is denoted by a||S and defined as
a||S = {(x, a(x)) : x ∈ X and a(x) ∈ S}.
In terms of the pixel representation of a||S we have a||S = {(x, a(x)) : a(x) ∈ S}. The double-bar notation is used
to focus attention on the fact that the restriction is applied to the second coordinate of a.
Image restriction in terms of subsets of the value set is an extremely useful concept in computer vision, as
many image processing tasks are restricted to image domains over which the image values satisfy certain
properties. Of course, one can always write this type of restriction in terms of a first-coordinate restriction by
setting Z = {x ∈ X : a(x) ∈ S} so that a||S = a|Z. However, writing a program statement such as b := a|Z is of
little value since Z is implicitly specified in terms of S; i.e., Z must be determined in terms of the property
“a(x) ∈ S.” Thus, Z would have to be precomputed, adding to the computational overhead as well as to the
amount of code. In contrast, direct restriction of the second coordinate values to an explicitly specified set S avoids these
problems and provides for easier implementation.
As mentioned, restrictions to the range set provide a useful tool for expressing various algorithmic
procedures. For instance, if a ∈ ℝ^X and S is the interval (k, ∞), where k denotes some given threshold
value, then a||(k,∞) denotes the image a restricted to all those points of X where a(x) exceeds the value k. In
order to reduce notation, we define a||>k ≡ a||(k,∞). Similarly,
a||≥k ≡ a||[k,∞), a||<k ≡ a||(-∞,k), a||=k ≡ a||{k}, and a||≤k ≡ a||(-∞,k].
As in the case of characteristic functions, a more general form of range restriction is given when S
corresponds to a set-valued image S ∈ (2^𝔽)^X; i.e., S : X → 2^𝔽. In this case we define
a||S = {(x, a(x)) : a(x) ∈ S(x)}.
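The following standalone C++ sketch illustrates range restriction; it stores an image explicitly as its data structure representation {(x, a(x))} so that a||S is obtained by simply discarding pixels, with the set S described by a predicate. The representation is an assumption made for this example only.

```cpp
#include <cstdio>
#include <functional>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;
using PixelList = std::vector<std::pair<Point, double>>;   // the data structure {(x, a(x))}

// a||S: keep only the pixels whose value lies in S, described here by a predicate.
PixelList rangeRestrict(const PixelList& a, const std::function<bool(double)>& inS) {
    PixelList out;
    for (const auto& px : a)
        if (inS(px.second)) out.push_back(px);
    return out;
}

int main() {
    PixelList a = {{{0, 0}, 0.1}, {{0, 1}, 0.8}, {{1, 0}, 0.6}, {{1, 1}, 0.3}};
    double k = 0.5;
    PixelList above = rangeRestrict(a, [k](double v) { return v > k; });   // a||_{>k}
    std::printf("a||_{>k} has %zu pixels\n", above.size());                // 2
    return 0;
}
```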
Combining the concepts of first and second coordinate (domain and range) restrictions provides the general
definition of an image restriction. If a ∈ 𝔽^X, Z ⊆ X, and S ⊆ 𝔽, then the restriction of a to Z and S is
defined as
a|(Z,S) = a ∩ (Z × S).
It follows that a|(Z,S) = {(x, a(x)) : x ∈ Z and a(x) ∈ S}, a|(X,S) = a||S, and a|(Z,𝔽) = a|Z.
The extension of a ∈ 𝔽^X to b ∈ 𝔽^Y on Y, where X and Y are subsets of the same topological space, is
denoted by a|b and defined by
(a|b)(y) = a(y) if y ∈ X and (a|b)(y) = b(y) if y ∈ Y\X.
Similarly, the concepts of the domain and the range of an image are viewed as the functions domain and range,
where domain(a) = X and range(a) = {a(x) : x ∈ X}. For example, the statement
s := domain(a||>k)
yields the set of all points (pixel locations) where a(x) exceeds k, namely s = {x ∈ X : a(x) > k}.
Assuming the correct dimensionality in the first coordinate, concatenation of any number of images is defined
inductively using the formula (a|b|c) = ((a|b)|c), so that in general we have
(a1|a2| ··· |an) = ((a1|a2| ··· |an-1)|an).
If a ∈ (ℝⁿ)^X and x ∈ X, then a(x) is a vector of the form a(x) = (a1(x), ..., an(x)), where for each i = 1, ..., n,
ai ∈ ℝ^X. Thus, an image a ∈ (ℝⁿ)^X is of the form a = (a1, ..., an), and with each vector value a(x) there are
associated n real values ai(x).
Scalar operations on vector-valued images are applied componentwise; for example, r + a ≡ (r1 + a1, ..., rn + an)
for r = (r1, ..., rn) ∈ ℝⁿ, and similarly for the other arithmetic operations. In the special case where
r = (r, r, ..., r), we simply use the scalar and define r + a ≡ r + a, r · a ≡ r · a, and so on.
As before, binary operations on multi-valued images are induced by the corresponding binary operation γ
on the value set ℝⁿ. It turns out to be useful to generalize this concept by replacing
the binary operation γ by a sequence of binary operations γj, where j = 1, ..., n, and defining
a γ b ≡ (a γ1 b, a γ2 b, ..., a γn b).
Then for a, b ∈ (ℝⁿ)^X and c = a γ b, the components of c(x) = (c1(x), ..., cn(x)) have values
cj(x) = a(x) γj b(x) for j = 1, ..., n.
As another example, suppose γ1 and γ2 are two binary operations on ℝ² defined by
(x1, x2) γ1 (y1, y2) = x1y1 - x2y2
and
(x1, x2) γ2 (y1, y2) = x1y2 + x2y1,
respectively. Now if a, b ∈ (ℝ²)^X represent two complex-valued images, then the product c = a γ b represents
pointwise complex multiplication, namely
c(x) = (a1(x)b1(x) - a2(x)b2(x), a1(x)b2(x) + a2(x)b1(x)).
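A minimal standalone C++ sketch of this induced operation follows; it treats an ℝ²-valued image as an array of (real, imaginary) pairs, an assumption made only for the illustration.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

using Pixel2 = std::pair<double, double>;   // (real, imaginary) components
using Image2 = std::vector<Pixel2>;

// c(x) = (a1 b1 - a2 b2, a1 b2 + a2 b1): pointwise complex multiplication.
Image2 complexProduct(const Image2& a, const Image2& b) {
    Image2 c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        double a1 = a[i].first, a2 = a[i].second;
        double b1 = b[i].first, b2 = b[i].second;
        c[i] = {a1 * b1 - a2 * b2, a1 * b2 + a2 * b1};
    }
    return c;
}

int main() {
    Image2 a = {{1, 2}}, b = {{3, 4}};                   // (1 + 2i)(3 + 4i) = -5 + 10i
    Image2 c = complexProduct(a, b);
    std::printf("c = (%g, %g)\n", c[0].first, c[0].second);
    return 0;
}
```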
Basic operations on single- and multi-valued images can be combined to form image processing operations of
arbitrary complexity. Two such operations that have proven to be extremely useful in processing real
vector-valued images are the winner-take-all jth-coordinate maximum and minimum of two images.
Specifically, if a, b ∈ (ℝⁿ)^X, then the jth-coordinate maximum of a and b is defined as
a ∨|j b = {(x, c(x)) : c(x) = a(x) if aj(x) ≥ bj(x), otherwise c(x) = b(x)},
and the jth-coordinate minimum is defined analogously.
Unary operations on vector-valued images are defined in a similar componentwise fashion. Given a function
f : ℝ → ℝ, then f induces a function ℝⁿ → ℝⁿ, again denoted by f, which is defined by
f(x1, x2, ..., xn) ≡ (f(x1), f(x2), ..., f(xn)).
These functions provide for one type of unary operation on vector-valued images. In particular, if
a ∈ (ℝⁿ)^X, then f(a) = (f(a1), ..., f(an)).
Thus, if f = sin, then
sin(a) = (sin(a1), ..., sin(an)).
A more complex type of unary image operation is provided by functions f : ℝⁿ → ℝⁿ of the form f = (f1, ..., fn),
where each fj : ℝⁿ → ℝ, since by definition
f(a) = (f1(a1, ..., an), ..., fn(a1, ..., an)),
which means that the construction of each new coordinate depends on all the original coordinates. To provide
a specific example, define f1 : ℝ² → ℝ by f1(x, y) = sin(x)cosh(y) and f2 : ℝ² → ℝ by f2(x, y) = cos(x)sinh(y).
Then the induced function f : ℝ² → ℝ² is given by f = (f1, f2). Applying f to an image a = (a1, a2) ∈ (ℝ²)^X
results in the image
f(a) = (sin(a1)cosh(a2), cos(a1)sinh(a2)).
Thus, if we represent complex numbers as points in ℝ² and a denotes a complex-valued image, then f(a) is a
pointwise application of the complex sine function.
Global reduce operations are also applied componentwise. For example, if a ∈ (ℝⁿ)^X and k = card(X), then
Σa = (Σa1, ..., Σan) ∈ ℝⁿ.
In contrast, the summation a1 + a2 + ··· + an is an element of ℝ^X since each ai ∈ ℝ^X. Note that the projection
function pi, which maps a to its ith component image ai, is a unary operation (ℝⁿ)^X → ℝ^X.
Similarly,
∨a = (∨a1, ..., ∨an)
and
∧a = (∧a1, ..., ∧an).
concatenation (a|b), (a1|a2| ··· |an)
characteristic functions
χ≤b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≤ b(x), otherwise c(x) = 0}
χ<b(a) = {(x, c(x)) : c(x) = 1 if a(x) < b(x), otherwise c(x) = 0}
χ=b(a) = {(x, c(x)) : c(x) = 1 if a(x) = b(x), otherwise c(x) = 0}
χ≥b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≥ b(x), otherwise c(x) = 0}
χ>b(a) = {(x, c(x)) : c(x) = 1 if a(x) > b(x), otherwise c(x) = 0}
χ≠b(a) = {(x, c(x)) : c(x) = 1 if a(x) ≠ b(x), otherwise c(x) = 0}
Whenever b is a constant image, say b = k (i.e., b(x) = k for all x ∈ X), then we simply write a^k for a^b and
logk a for logb a. Similarly, we have k + a, χ≤k(a), χ<k(a), etc.
extension a|b
domain domain(a) = X
range range(a) = {a(x) : x ∈ X}
generic reduction Γa = a(x1) γ a(x2) γ ··· γ a(xn)
image sum Σa = a(x1) + a(x2) + ··· + a(xn)
image product Πa = a(x1) · a(x2) ··· a(xn)
image maximum ∨a = a(x1) ∨ a(x2) ∨ ··· ∨ a(xn)
image minimum ∧a = a(x1) ∧ a(x2) ∧ ··· ∧ a(xn)
image complement ã (set-valued), ac (Boolean)
pseudo inverse a⁻¹ = {(x, c(x)) : c(x) = 1/a(x) if a(x) ≠ 0, and c(x) = 0 if a(x) = 0}
image transpose a′ = {((x, y), a′(x, y)) : a′(x, y) = a(y, x), (y, x) ∈ X}
A template is an image whose pixel values are images. More precisely, an 𝔽-valued template from Y to X is a
function t : Y → 𝔽^X, i.e., an element of (𝔽^X)^Y; for each y ∈ Y, the image t(y) ∈ 𝔽^X is denoted by ty.
The pixel values ty(x) of this image are called the weights of the template at point y.
If t is a real- or complex-valued template from Y to X, then the support of ty is denoted by S(ty) and is defined
as
S(ty) = {x ∈ X : ty(x) ≠ 0}.
More generally, if t ∈ (𝔽^X)^Y and 𝔽 is an algebraic structure with a zero element 0, then the support of ty will
be defined as S(ty) = {x ∈ X : ty(x) ≠ 0}.
For extended real-valued templates we also define the following supports at infinity:
S∞(ty) = {x ∈ X : ty(x) ≠ ∞}
and
S-∞(ty) = {x ∈ X : ty(x) ≠ -∞}.
If X is a space with an operation + such that (X, +) is a group, then a template t ∈ (𝔽^X)^X is said to be
translation invariant (with respect to the operation +) if and only if for each triple x, y, z ∈ X we have that
ty(x) = ty+z(x + z). Templates that are not translation invariant are called translation variant or, simply,
variant templates. A large class of translation invariant templates with finite support have the nice property
that they can be defined pictorially. For example, let X = ℤ² and y = (x, y) be an arbitrary point of X. Set x1
= (x, y - 1), x2 = (x + 1, y), and x3 = (x + 1, y - 1). Define a template t ∈ (ℝ^X)^X by defining the weights ty(y) = 1, ty(x1) =
3, ty(x2) = 2, ty(x3) = 4, and ty(x) = 0 whenever x is not an element of {y, x1, x2, x3}. Note that it follows from
the definition of t that S(ty) = {y, x1, x2, x3}. Thus, at any arbitrary point y, the configuration of the support
and weights of ty is as shown in Figure 1.5.1. The shaded cell in the pictorial representation of ty indicates the
location of the point y.
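The following standalone C++ sketch shows one way a translation-invariant template with finite support can be represented in a program, namely as a list of offsets from the target point together with the corresponding weights; the struct and function names are hypothetical and are not iac++ identifiers.

```cpp
#include <cstdio>
#include <vector>

// One weight of a translation-invariant template, given as an offset (dx, dy)
// from the target point y together with its value.
struct WeightedOffset {
    int dx, dy;
    double w;
};
using InvariantTemplate = std::vector<WeightedOffset>;

// The template of the pictorial example: t_y(y) = 1, t_y(x1) = 3, t_y(x2) = 2,
// t_y(x3) = 4 with x1 = (x, y-1), x2 = (x+1, y), x3 = (x+1, y-1); all other weights are 0.
InvariantTemplate exampleTemplate() {
    return {{0, 0, 1.0}, {0, -1, 3.0}, {1, 0, 2.0}, {1, -1, 4.0}};
}

int main() {
    InvariantTemplate t = exampleTemplate();
    std::printf("support size = %zu, weight at offset (1,-1) = %g\n", t.size(), t[3].w);
    return 0;
}
```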
There are certain collections of templates that can be defined explicitly in terms of parameters. These
parameterized templates are of great practical importance. A parameterized 𝔽-valued template from Y to X with
parameters in P is a function of the form t : P → (𝔽^X)^Y. The set P is called the set of parameters and each p ∈ P is called a
parameter of t.
Thus, a parameterized 𝔽-valued template from Y to X gives rise to a family of regular 𝔽-valued templates
from Y to X, namely {t(p) : p ∈ P}.
Image-Template Products
The definition of an image-template product provides the rules for combining images with templates and
templates with templates. The definition of this product includes the usual correlation and convolution
products used in digital image processing. Suppose 𝔽 is a value set with two binary operations ○ and γ,
where ○ distributes over γ, and γ is associative and commutative. If t ∈ (𝔽^X)^Y, then for each y ∈ Y,
ty ∈ 𝔽^X. Thus, if a ∈ 𝔽^X, where X is finite, then a ○ ty ∈ 𝔽^X and Γ(a ○ ty) ∈ 𝔽. It
follows that the binary operations ○ and γ induce a binary operation
𝔽^X × (𝔽^X)^Y → 𝔽^Y,
where the image b ∈ 𝔽^Y is defined by
b(y) = Γx∈X [a(x) ○ ty(x)].
The expression is called the right product of a with t. Note that while a is an image on X, the product
is an image on Y. Thus, templates allow for the transformation of an image from one type of domain to
an entirely different domain type.
Replacing the generic operations ○ and γ by multiplication and addition changes the generic product into
b = a • t,
the linear image-template product, where
b(y) = Σ x∈X a(x) · ty(x).
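A minimal sketch of the linear right product for a translation invariant template represented as a map from (row, column) offsets to weights; skipping points outside the array is an assumption about boundary handling, and this is not the iac++ implementation:

```python
import numpy as np

def linear_product(a, template):
    """b(y) = sum over x of a(x) * t_y(x) for a translation invariant template."""
    rows, cols = a.shape
    b = np.zeros_like(a, dtype=float)
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for (di, dj), w in template.items():
                x = (i + di, j + dj)
                if 0 <= x[0] < rows and 0 <= x[1] < cols:
                    total += a[x] * w
            b[i, j] = total
    return b
```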
For the remainder of this section we assume that the value set is a monoid under γ and let 0 denote its zero under the
operation γ. Suppose a is an image on X and t is a template from Y to Z, where X and Z are subsets of the same space. Since the value set is a
monoid, the product operator can be extended to a mapping that accepts such pairs.
The left product is defined in a similar fashion. Subsequent examples will demonstrate that the ability
to replace X with Z greatly simplifies the issue of template implementation and the use of templates in
algorithm development.
A significant reduction in the number of computations involved in the image-template product can be achieved whenever the support of the participating template is small in comparison to X.
As pointed out earlier, substitution of different value sets and specific binary operations for γ and ○ results in
a wide variety of different image transforms. Our prime examples are the ring of the reals under the usual addition
and multiplication and the lattice-ordered value sets of the extended and nonnegative extended reals. The extended
real-valued structure provides for two lattice products: one in which image values and template weights are combined
by addition and reduced by the maximum, and one in which they are combined by addition and reduced by the minimum.
In order to distinguish between these two types of lattice transforms, we call the first operator the additive
maximum and the second the additive minimum. It follows from our earlier discussion that if
S(ty) ∩ X = ∅, then the value of b(y) for the additive maximum is −∞, the zero of the extended reals under the operation of ∨. Similarly,
for the additive minimum, b(y) = +∞ in that case.
The left additive max and min operations are defined analogously. The relationship between the additive max and min is given in terms of lattice duality,
expressed using the conjugate image a*, defined by a*(x) = [a(x)]*, and the conjugate (or dual) template t* of t.
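A minimal sketch of the additive maximum for a translation invariant template given as a map from offsets to weights; points outside X are skipped, so b(y) = −∞ when the support misses X, as described above:

```python
import numpy as np

def additive_max(a, template):
    """b(y) = max over x in S(t_y) of ( a(x) + t_y(x) ); -inf when the support misses X."""
    rows, cols = a.shape
    b = np.full(a.shape, -np.inf)
    for i in range(rows):
        for j in range(cols):
            for (di, dj), w in template.items():
                x, y = i + di, j + dj
                if 0 <= x < rows and 0 <= y < cols:
                    b[i, j] = max(b[i, j], a[x, y] + w)
    return b
```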
The value set of the nonnegative extended reals also provides for two lattice products, in which image values and
template weights are combined by multiplication and reduced by the maximum or by the minimum.
Here 0 is the zero of this value set under the operation of ∨, so that b(y) = 0 for the first product whenever S(ty) ∩ X = ∅. Similarly,
b(y) = +∞ for the second product in that case.
These lattice products are called the multiplicative maximum and multiplicative minimum,
respectively. The left multiplicative max and left multiplicative min are defined analogously,
and the duality relation between the multiplicative max and min parallels the one for the additive case.
In the following list of pertinent image-template products, a denotes an image on the finite point set X and t a template from Y to X. Again, for each operation
we assume the appropriate value set.
right generic product
the induced operation s γ t is defined in terms of the induced binary image operation on the set of images,
The unary template operations of prime importance are the global reduce operations. Suppose Y is a finite
point set, say Y = {y1, y2, …, yn}. Any binary semigroup operation γ on the value set induces a global
reduce operation
Γt = ty1 γ ty2 γ ··· γ tyn,
which reduces a template to an image. Thus, for example, if γ is the operation of addition (γ = +), then Γ = Σ and
Γt = ty1 + ty2 + ··· + tyn.
In all, the real-valued case provides for four basic global reduce operations, namely the sum, product, maximum, and minimum reduce operations.
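As an illustration of how a binary semigroup operation induces a global reduce, the sketch below folds an assumed operation gamma over a sequence of values; it is not tied to the iac++ library.

```python
from functools import reduce

def global_reduce(gamma, values):
    # Gamma(values) = values[0] gamma values[1] gamma ... gamma values[-1]
    return reduce(gamma, values)

print(global_reduce(lambda u, v: u + v, [1, 2, 3, 4]))  # sum reduce -> 10
print(global_reduce(max, [1, 2, 3, 4]))                 # max reduce -> 4
```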
If the value set has two binary operations γ and ○ so that it forms a ring (or semiring), then the set of templates under
the induced operations is also a ring (or semiring). Analogous to the image-template product, the
binary operations ○ and γ induce a template convolution product
defined as follows. Suppose s and t are templates and X is a finite point set. Then the template product
r of s and t is the template whose weights are obtained by combining the weights of s and t with ○ and reducing over X with γ.
The lattice template product is defined in a similar manner. For extended real-valued templates s and t,
the product template r is given by
The following example provides a specific instance of the above product formulation.
If s and t are defined as above, with values −∞ outside the support, then the template product
is the template defined by
The template t is not a template over the value set required for the multiplicative products. To provide an example of the multiplicative template product, we
redefine t as
Then the multiplicative product is given by
The utility of template products stems from the fact that in semirings the equation
a • (s • t) = (a • s) • t
holds [1]. This equation can be utilized in order to reduce the computational burden associated with typical
image-template products. For example, suppose r = s • t, where
The construction of the new image b := a•r requires nine multiplications and eight additions per pixel (if we
ignore boundary pixels). In contrast, the computation of the image b := (a•s) •t requires only six
multiplications and four additions per pixel. For large images (e.g., size 1024 × 1024) this amounts to
significant savings in computation.
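The following sketch illustrates the savings just described, composing a 3 × 3 template from a 1 × 3 template s and a 3 × 1 template t; the particular smoothing weights are an arbitrary choice for illustration:

```python
import numpy as np

# Row template s (1 x 3) and column template t (3 x 1); their composition r is the
# full 3 x 3 template. Applying s then t needs 3 + 3 = 6 multiplications and
# 2 + 2 = 4 additions per pixel, versus 9 and 8 for the composed template r.
s = np.array([1.0, 2.0, 1.0])      # horizontal weights (illustrative)
t = np.array([1.0, 2.0, 1.0])      # vertical weights (illustrative)
r = np.outer(t, s)                 # the composed 3 x 3 template

def conv1d_rows(a, k):
    pad = len(k) // 2
    ap = np.pad(a, ((0, 0), (pad, pad)))
    return sum(k[m] * ap[:, m:m + a.shape[1]] for m in range(len(k)))

def conv1d_cols(a, k):
    pad = len(k) // 2
    ap = np.pad(a, ((pad, pad), (0, 0)))
    return sum(k[m] * ap[m:m + a.shape[0], :] for m in range(len(k)))

a = np.random.rand(256, 256)
b_separable = conv1d_cols(conv1d_rows(a, s), t)   # (a • s) • t
```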
sum reduce
product reduce
max reduce
min reduce
In the next list, s and t denote templates, X is a finite point set, and the appropriate value set is assumed.
generic template product
linear template product
Definition. A partially ordered set (or poset) is a set P together with a binary relation
≼, satisfying the following three axioms for arbitrary x, y, z ∈ P:
(i) x ≼ x (reflexive)
(ii) x ≼ y and y ≼ x imply x = y (antisymmetric)
(iii) x ≼ y and y ≼ z imply x ≼ z (transitive)
Now suppose that X is a point set, Y is a partially ordered point set with partial order ≼, and the value set is a monoid. A
recursive template t from Y to X assigns to each point y ∈ Y a nonrecursive part with support in X and a recursive part with support in Y, subject to two conditions:
1. y itself does not lie in the recursive support of ty, and
2. every point of the recursive support of ty precedes y with respect to ≼.
The recursive template operation computes a new pixel value b(y) based on both the pixel values a(x) of
the source image and some previously calculated new pixel values b(z), which are determined by the partial
order and the region of support of the participating template. By definition of a recursive template,
every recursively referenced point z strictly precedes y in the order ≼. Therefore, b(y) is always recursively computable.
Some partial orders that are commonly used in two-dimensional recursive transforms are forward and
backward raster scanning and serpentine scanning.
It follows from the definition of a recursive template that the computation of a new pixel value b(y) can be done only after all of its
predecessors (ordered by ≼) have been computed. Thus, in contrast to nonrecursive template operations,
recursive template operations are not computed in a globally parallel fashion.
Note that if the recursive template t is defined such that its recursive part is empty for all y ∈ Y, then one obtains the
usual nonrecursive template operation
Hence, recursive template operations are natural extensions of nonrecursive template operations.
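A minimal sketch of a recursive template operation under the forward raster-scanning order; the particular weights and the choice of left and upper predecessors are illustrative assumptions:

```python
import numpy as np

def recursive_raster_filter(a, w0=0.5, w_left=0.25, w_up=0.25):
    """b(y) = w0*a(y) + w_left*b(left of y) + w_up*b(above y), computed in
    forward raster-scan order, so every referenced b value already exists."""
    rows, cols = a.shape
    b = np.zeros_like(a, dtype=float)
    for i in range(rows):
        for j in range(cols):
            val = w0 * a[i, j]
            if j > 0:
                val += w_left * b[i, j - 1]
            if i > 0:
                val += w_up * b[i - 1, j]
            b[i, j] = val
    return b
```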
Recursive additive maximum and multiplicative maximum are defined in a similar fashion. Specifically, for an image a on X
and an extended real-valued recursive template t, the recursive additive maximum combines the source values and the
previously computed result values with the template weights by addition and reduces with the maximum; the recursive
multiplicative maximum is obtained analogously using multiplication.
The operations of the recursive additive minimum and multiplicative minimum are defined
in the same straightforward fashion.
Recursive additive maximum and minimum, as well as recursive multiplicative maximum and minimum, are
nonlinear operations. However, the recursive linear product remains a linear operation.
The basic recursive template operations described above can be easily generalized to the generic recursive
image-template product by simple substitution of the specific operations, such as multiplication and addition,
by the generic operations ○ and γ. More precisely, given a semiring with identity, one can
define the generic recursive product in the corresponding way.
Again, in addition to the basic recursive template operations discussed earlier, a wide variety of recursive
template operations can be derived from the generalized recursive rule by substituting different binary
operations for ○ and γ. Additionally, parameterized recursive templates are defined in the same manner as
parameterized nonrecursive templates, namely as functions from a parameter set to the set of recursive templates.
The definition of the left recursive product is also straightforward. However, for the sake of brevity, and
since the different left products are not required for the remainder of this text, we dispense with their
formulation. Additional facts about recursive products, their properties and applications can be found in [1,
56, 57].
1.7. Neighborhoods
There are several types of template operations that are more easily implemented in terms of neighborhood
operations. Typically, neighborhood operations replace template operations whenever the values in the
support of a template consist only of the unit elements of the value set associated with the template. A
template with the property that for each y ∈ Y, the values in the support of ty consist only of the
unit of the value set is called a unit template.
For example, the invariant template shown in Figure 1.7.1 is a unit template with respect to the
value set since the value 1 is the unit with respect to multiplication.
Figure 1.7.1 The unit Moore template for the value set .
Similarly, the template shown in Figure 1.7.2 is a unit template with respect to the value set
since the value 0 is the unit with respect to the operation +.
Figure 1.7.2 The unit von Neumann template for the value set .
We need to point out that the difference between the mathematical equality b = a • t and the pseudocode
statement b := a • t is that in the latter the new image is computed only for those points y for which
S(ty) ∩ X ≠ ∅. Observe that since a(x) · 1 = a(x) and M(y) = S(ty), where M(y) denotes the Moore
neighborhood of y (see Figure 1.2.2), it follows that
This observation leads to the notion of neighborhood reduction. In implementation, neighborhood reduction
avoids unnecessary multiplication by the unit element and, as we shall shortly demonstrate, neighborhood
reduction also avoids some standard boundary problems associated with image-template products.
To precisely define the notion of neighborhood reduction we need a more general notion of the reduce
operation Γ, which was defined in terms of a binary operation γ on the value set. The more general
form of Γ is a function defined for arbitrary finite collections of values.
For example, if X is an m × n array of points, then one such function
could be defined as
is defined as , where
.
Now suppose t is a unit template with respect to the ○ operation of the semiring,
N is a neighborhood system defined by N(y) = S(ty), and a is an image on X. It then
follows that the product of a with t reduces, via Γ, the values of a over the neighborhood N(y)
for each y ∈ Y. Note that this product is similar to the image-template product in that it is a function
Unit templates act like characteristic functions in that they do not weigh a pixel, but simply note which pixels
are in their support and which are not. When employed in the image-template operations of their semiring,
they only serve to collect a number of values that need to be reduced by the gamma operation. For this reason,
unit templates are also referred to as characteristic templates. Now suppose that we wish to describe a
translation invariant unit template with a specific support such as the 3 × 3 support of the Moore template t
shown in Figure 1.7.1. Suppose further that we would like this template to be used with a variety of reduction
operations, for instance, summation and maximum. In fact, we cannot describe such an operand without
regard to the image-template operation with which it will be used. For us to derive the expected results, the
template must map all points in its support to the unitary value with respect to the combining operation .
Thus, for the reduce operation of summation , the unit values in the support must be 1, while for the
maximum reduce operation , the values in the support must all be 0. Therefore, we cannot define a single
template operand to characterize a neighborhood for reduction without regard to the image-template operation
to be used to reduce the values within the neighborhood. However, we can capture exactly the information of
interest in unit templates with the simple notion of neighborhood function. Thus, for example, the Moore
neighborhood M can be used to add the values in every 3 × 3 neighborhood as well as to find the maximum or
minimum in such a neighborhood by using the statements a • M, , and , respectively. This is one
advantage for replacing unit templates with neighborhoods.
Another advantage of using neighborhoods instead of templates can be seen by considering the simple
example of image smoothing by local averaging. Suppose a is an image on an m × n array of
points X, and t is the 3 × 3 unit Moore template with unit values 1. The image b obtained from the
statement represents the image obtained from a by local averaging, since the new pixel
value b(y) is given by
where N(y) + q ≡ {x + q : x ∈ N(y)}. Just as for template composition, algorithm optimization can be achieved
by use of the equation for appropriate neighborhood functions and
neighborhood reduction functions Γ. For k ≥ 1, the kth iterate of a neighborhood is
defined inductively as Nk = Nk−1 • N, where N1 = N.
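A minimal sketch of local averaging over Moore neighborhoods, dividing by the number of neighborhood points that actually fall inside X (one way to handle the boundary; other conventions are possible):

```python
import numpy as np

def moore_average(a):
    """Average a over 3 x 3 Moore neighborhoods, dividing by the number of
    neighborhood points that lie inside the image."""
    rows, cols = a.shape
    b = np.zeros_like(a, dtype=float)
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    x, y = i + di, j + dj
                    if 0 <= x < rows and 0 <= y < cols:
                        total += a[x, y]
                        count += 1
            b[i, j] = total / count
    return b
```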
Most neighborhood functions used in image processing are translation invariant neighborhoods on the plane (in particular,
on the discrete plane). A neighborhood function is said to be translation invariant if
N(y + p) = N(y) + p for every point p. Given a translation invariant neighborhood N, we define its
reflection or conjugate N* by N*(y) = N*(0) + y, where N*(0) = {−x : x ∈ N(0)} and 0 denotes
the origin. Conjugate neighborhoods play an important role in morphological image processing.
Note also that for a translation invariant neighborhood N, the kth iterate of N can be expressed in terms of the
sum of sets
Nk(y) = Nk-1 (y) + N(0)
neighborhood sum
neighborhood maximum
neighborhood minimum
Note that
and
Since rp(i, k) < rp(i′, k′) ⟺ i < i′, or i = i′ and k < k′, rp linearizes the array using the row
scanning order as shown:
It follows that the row-scanning order on is given by
or, equivalently, by
The factor of the Cartesian product is decomposed in a similar fashion. Here the
row-scanning map is given by
The p-product or generalized matrix product of A and B is denoted by A •p B, and is the matrix C
defined by
where c(s,j)(i,t) denotes the (s, j)th row and (i, t)th column entry of C. Here we use the lexicographical order (s,
j) < (s′, j′) ⟺ s < s′, or s = s′ and j < j′. Thus, the matrix C has the following form:
The entry c(s,j)(i,t) in the (s,j)-row and (i,t)-column is underlined for emphasis.
To provide an example, suppose that l = 2, m = 6, n = 4, and q = 3. Then for p = 2, one obtains m/p = 3, n/p =
2, and 1 ≤ k ≤ 2. Now let
and
Then the (2,1)-row and (2,3)-column element c(2,1)(2,3) of the matrix
is given by
Thus, in order to compute c(2,1)(2,3), the two underlined elements of A are combined with the two underlined
elements of B as illustrated:
In particular,
If
then
This shows that the transpose property, which holds for the regular matrix product, is generally false for the
p-product. The reason is that the p-product is not a dual operation in the transpose domain. In order to make
the transpose property hold we define the dual operation of •p by
It follows that
and the p-product is the dual operation of . In particular, we now have the transpose property
.
Since the operation is defined in terms of matrix transposition, the labeling of matrix indices is reversed.
Specifically, if A = (ast) is an l × m matrix, then A gets reindexed as A = (as,(kj)), using the convention
Similarly, if B = (bst) is an n × q matrix, then the entries of B are relabeled as b(i,k),t, using the convention
Thus, in order to compute c63, the two underlined elements of A are combined with the two underlined
elements of B as illustrated:
As a final observation, note that the matrices A, B, and C in this example have the form of the transposes of
the matrices B, A, and C, respectively, of the previous example.
1.9. References
1 G. Ritter, “Image algebra with applications.” Unpublished manuscript, available via anonymous ftp
from ftp://ftp.cis.ufl.edu/pub/src/ia/documents, 1994.
2 S. Unger, “A computer oriented toward spatial problems,” Proceedings of the IRE, vol. 46, pp.
1744-1750, 1958.
3 J. von Neumann, “The general logical theory of automata,” in Cerebral Mechanism in Behavior:
The Hixon Symposium, New York, NY: John Wiley & Sons, 1951.
4 J. von Neumann, Theory of Self-Reproducing Automata. Urbana, IL: University of Illinois Press,
1966.
5 K. Batcher, “Design of a massively parallel processor,” IEEE Transactions on Computers, vol. 29,
no. 9, pp. 836-840, 1980.
6 M. Duff, D. Watson, T. Fountain, and G. Shaw, “A cellular logic array for image processing,”
Pattern Recognition, vol. 5, pp. 229-247, Sept. 1973.
7 M. J. B. Duff, “Review of CLIP image processing system,” in Proceedings of the National
Computing Conference, pp. 1055-1060, AFIPS, 1978.
8 M. Duff, “Clip4,” in Special Computer Architectures for Pattern Processing (K. Fu and T.
Ichikawa, eds.), ch. 4, pp. 65-86, Boca Raton, FL: CRC Press, 1982.
9 T. Fountain, K. Matthews, and M. Duff, “The CLIP7A image processor,” IEEE Pattern Analysis
and Machine Intelligence, vol. 10, no. 3, pp. 310-319, 1988.
10 L. Uhr, “Pyramid multi-computer structures, and augmented pyramids,” in Computing Structures
for Image Processing (M. Duff, ed.), pp. 95-112, London: Academic Press, 1983.
11 L. Uhr, Algorithm-Structured Computer Arrays and Networks. New York, NY: Academic Press,
1984.
12 W. Hillis, The Connection Machine. Cambridge, MA: The MIT Press, 1985.
13 J. Klein and J. Serra, “The texture analyzer,” Journal of Microscopy, vol. 95, 1972.
14 R. Lougheed and D. McCubbrey, “The cytocomputer: A practical pipelined image processor,” in
Proceedings of the Seventh International Symposium on Computer Architecture, pp. 411-418, 1980.
15 S. Sternberg, “Biomedical image processing,” Computer, vol. 16, Jan. 1983.
16 R. Lougheed, “A high speed recirculating neighborhood processing architecture,” in
Architectures and Algorithms for Digital Image Processing II, vol. 534 of Proceedings of SPIE, pp.
22-33, 1985.
17 E. Cloud and W. Holsztynski, “Higher efficiency for parallel processors,” in Proceedings IEEE
Southcon 84, (Orlando, FL), pp. 416-422, Mar. 1984.
18 E. Cloud, “The geometric arithmetic parallel processor,” in Proceedings Frontiers of Massively
Parallel Processing, George Mason University, Oct. 1988.
19 E. Cloud, “Geometric arithmetic parallel processor: Architecture and implementation,” in
Parallel Architectures and Algorithms for Image Understanding (V. Prasanna, ed.), (Boston, MA.),
Academic Press, Inc., 1991.
20 H. Minkowski, “Volumen und Oberfläche,” Mathematische Annalen, vol. 57, pp. 447-495, 1903.
21 H. Minkowski, Gesammelte Abhandlungen. Leipzig-Berlin: Teubner Verlag, 1911.
22 H. Hadwiger, Vorlesungen über Inhalt, Oberfläche und Isoperimetrie. Berlin: Springer-Verlag,
1957.
23 G. Matheron, Random Sets and Integral Geometry. New York: Wiley, 1975.
24 J. Serra, “Introduction a la morphologie mathematique,” booklet no. 3, Cahiers du Centre de
Morphologie Mathematique, Fontainebleau, France, 1969.
25 J. Serra, “Morphologie pour les fonctions “a peu pres en tout ou rien”,” technical report, Cahiers
du Centre de Morphologie Mathematique, Fontainebleau, France, 1975.
26 J. Serra, Image Analysis and Mathematical Morphology. London: Academic Press, 1982.
27 T. Crimmins and W. Brown, “Image algebra and automatic shape recognition,” IEEE
Transactions on Aerospace and Electronic Systems, vol. AES-21, pp. 60-69, Jan. 1985.
28 R. Haralick, S. Sternberg, and X. Zhuang, “Image analysis using mathematical morphology: Part
I,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 532-550, July 1987.
29 R. Haralick, L. Shapiro, and J. Lee, “Morphological edge detection,” IEEE Journal of Robotics
and Automation, vol. RA-3, pp. 142-157, Apr. 1987.
30 P. Maragos and R. Schafer, “Morphological skeleton representation and coding of binary
images,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-34, pp.
1228-1244, Oct. 1986.
31 P. Maragos and R. Schafer, “Morphological filters Part II : Their relations to median,
order-statistic, and stack filters,” IEEE Transactions on Acoustics, Speech, and Signal Processing,
vol. ASSP-35, pp. 1170-1184, Aug. 1987.
32 P. Maragos and R. Schafer, “Morphological filters Part I: Their set-theoretic analysis and
relations to linear shift-invariant filters,” IEEE Transactions on Acoustics, Speech, and Signal
Processing, vol. ASSP-35, pp. 1153-1169, Aug. 1987.
33 J. Davidson and A. Talukder, “Template identification using simulated annealing in morphology
neural networks,” in Second Annual Midwest Electro-Technology Conference, (Ames, IA), pp. 64-67,
IEEE Central Iowa Section, Apr. 1993.
34 J. Davidson and F. Hummer, “Morphology neural networks: An Introduction with applications,”
IEEE Systems Signal Processing, vol. 12, no. 2, pp. 177-210, 1993.
35 E. Dougherty, “Unification of nonlinear filtering in the context of binary logical calculus, part ii:
Gray-scale filters,” Journal of Mathematical Imaging and Vision, vol. 2, pp. 185-192, Nov. 1992.
36 D. Schonfeld and J. Goutsias, “Optimal morphological pattern restoration from noisy binary
images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, pp. 14-29, Jan.
1991.
37 J. Goutsias, “On the morphological analysis of discrete random shapes,” Journal of Mathematical
Imaging and Vision, vol. 2, pp. 193-216, Nov. 1992.
38 L. Koskinen and J. Astola, “Asymptotic behaviour of morphological filters,” Journal of
Mathematical Imaging and Vision, vol. 2, pp. 117-136, Nov. 1992.
39 S. R. Sternberg, “Language and architecture for parallel image processing,” in Proceedings of the
Conference on Pattern Recognition in Practice, (Amsterdam), May 1980.
40 S. Sternberg, “Overview of image algebra and related issues,” in Integrated Technology for
Parallel Image Processing (S. Levialdi, ed.), London: Academic Press, 1985.
41 P. Maragos, A Unified Theory of Translation-Invariant Systems with Applications to
Morphological Analysis and Coding of Images. Ph.D. dissertation, Georgia Institute of Technology,
Atlanta, 1985.
42 J. Davidson, Lattice Structures in the Image Algebra and Applications to Image Processing. PhD
thesis, University of Florida, Gainesville, FL, 1989.
43 J. Davidson, “Classification of lattice transformations in image processing,” Computer Vision,
Graphics, and Image Processing: Image Understanding, vol. 57, pp. 283-306, May 1993.
44 P. Miller, “Development of a mathematical structure for image processing,” optical division tech.
report, Perkin-Elmer, 1983.
45 J. Wilson, D. Wilson, G. Ritter, and D. Langhorne, “Image algebra FORTRAN language, version
3.0,” Tech. Rep. TR-89-03, University of Florida CIS Department, Gainesville, 1989.
46 J. Wilson, “An introduction to image algebra Ada,” in Image Algebra and Morphological Image
Processing II, vol. 1568 of Proceedings of SPIE, (San Diego, CA), pp. 101-112, July 1991.
47 J. Wilson, G. Fischer, and G. Ritter, “Implementation and use of an image processing algebra for
programming massively parallel computers,” in Frontiers ’88: The Second Symposium on the
Frontiers of Massively Parallel Computation, (Fairfax, VA), pp. 587-594, 1988.
48 G. Fischer and M. Rowlee, “Computation of disparity in stereo images on the Connection
Machine,” in Image Algebra and Morphological Image Processing, vol. 1350 of Proceedings of
SPIE, pp. 328-334, 1990.
49 D. Crookes, P. Morrow, and P. McParland, “An algebra-based language for image processing on
transputers,” IEE Third Annual Conference on Image Processing and its Applications, vol. 307, pp.
457-461, July 1989.
50 D. Crookes, P. Morrow, and P. McParland, “An implementation of image algebra on transputers,”
tech. rep., Department of Computer Science, Queen’s University of Belfast, Northern Ireland, 1990.
51 J. Wilson, “Supporting image algebra in the C++ language,” in Image Algebra and
Morphological Image Processing IV, vol. 2030 of Proceedings of SPIE, (San Diego, CA), pp.
315-326, July 1993.
52 University of Florida Center for Computer Vision and Visualization, Software User Manual for
the iac++ Class Library, version 1.0 ed., 1994. Available via anonymous ftp from ftp.cis.ufl.edu in
/pub/ia/documents.
53 G. Birkhoff and J. Lipson, “Heterogeneous algebras,” Journal of Combinatorial Theory, vol. 8,
pp. 115-133, 1970.
54 University of Florida Center for Computer Vision and Visualization, Software User Manual for
the iac++ Class Library, version 2.0 ed., 1995. Available via anonymous ftp from ftp.cis.ufl.edu in
/pub/ia/documents.
55 G. Ritter, J. Davidson, and J. Wilson, “Beyond mathematical morphology,” in Visual
Communication and Image Processing II, vol. 845 of Proceedings of SPIE, (Cambridge, MA), pp.
260-269, Oct. 1987.
56 D. Li, Recursive Operations in Image Algebra and Their Applications to Image Processing. PhD
thesis, University of Florida, Gainesville, FL, 1990.
57 D. Li and G. Ritter, “Recursive operations in image algebra,” in Image Algebra and
Morphological Image Processing, vol. 1350 of Proceedings of SPIE, (San Diego, CA), July 1990.
58 R. Gonzalez and P. Wintz, Digital Image Processing. Reading, MA: Addison-Wesley, second ed.,
1987.
59 G. Ritter, “Heterogeneous matrix products,” in Image Algebra and Morphological Image
Processing II, vol. 1568 of Proceedings of SPIE, (San Diego, CA), pp. 92-100, July 1991.
60 R. Cuninghame-Green, Minimax Algebra: Lecture Notes in Economics and Mathematical
Systems 166. New York: Springer-Verlag, 1979.
61 G. Ritter and H. Zhu, “The generalized matrix product and its applications,” Journal of
Mathematical Imaging and Vision, vol. 1(3), pp. 201-213, 1992.
62 H. Zhu and G. X. Ritter, “The generalized matrix product and the wavelet transform,” Journal of
Mathematical Imaging and Vision, vol. 3, no. 1, pp. 95-104, 1993.
63 H. Zhu and G. X. Ritter, “The p-product and its applications in signal processing,” SIAM Journal
of Matrix Analysis and Applications, vol. 16, no. 2, pp. 579-601, 1995.
Chapter 2
Image Enhancement Techniques
2.1. Introduction
The purpose of image enhancement is to improve the visual appearance of an image, or to transform an image
into a form that is better suited for human interpretation or machine analysis. Although there exists a
multitude of image enhancement techniques, surprisingly, there does not exist a corresponding unifying
theory of image enhancement. This is due to the absence of a general standard of image quality that could
serve as a design criterion for image enhancement algorithms. In this chapter we consider several techniques
that have proved useful in a wide variety of applications.
For actual implementation the summation will probably involve the loop
Figure 2.2.1 Averaging of multiple images for different values of k. Additional explanations are given in the
comments section.
where a0 is the true (uncorrupted by noise) image and ηi(x) is a random variable representing the introduction
of noise (see Figure 2.2.1). The technique of averaging multiple images assumes that the noise is uncorrelated
and has zero mean. Under these assumptions the law of large numbers guarantees that, as k increases, the average
of the k observations approaches the true image a0.
Additional details about the effects of this simple technique can be found in Gonzalez and Wintz [1].
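A minimal sketch of the accumulation loop alluded to above, assuming the k observations are available as NumPy arrays:

```python
import numpy as np

def average_images(frames):
    """Average k noisy observations of the same scene; with zero-mean,
    uncorrelated noise the average approaches the true image as k grows."""
    acc = np.zeros_like(frames[0], dtype=float)
    for frame in frames:          # the accumulation loop
        acc += frame
    return acc / len(frames)

# usage with simulated noise (illustration only)
truth = np.zeros((64, 64))
frames = [truth + np.random.normal(0.0, 1.0, truth.shape) for _ in range(16)]
estimate = average_images(frames)
```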
The actual mathematical formulation of this method is as follows. Suppose a denotes the source image
and N : X → 2^X a neighborhood function. If ny denotes the number of points in N(y) ∩ X, then the enhanced
image b is given by
For the precise formulation, let a denote the source image and, for y ∈ X, let N(y) denote the desired neighborhood of y.
Usually, N(y) is a 3 × 3 Moore neighborhood. Define
where a0 = a.
The recursion algorithm may reintroduce pixels with values 1 that had been eliminated at a previous stage.
The following alternative recursion formulation avoids this phenomenon:
Figure 2.7.2 The lower-resolution image is shown on the left and the smoothed
version of a on the right.
The median image m is calculated with the following image algebra pseudocode:
Comments and Observations
The figures below offer a comparison between the results of applying an averaging filter and a median filter.
Figure 2.8.1 is the source image of a noisy jet. Figures 2.8.2 and 2.8.3 show the results of applying the averaging
and median filters, respectively, over 3 × 3 neighborhoods of the noisy jet image.
Figure 2.8.2 Noisy jet smoothed with averaging filter over a 3 × 3 neighborhood.
Figure 2.8.3 Result of applying median filter over a 3 × 3 neighborhood to the noisy jet image.
Coffield [7] describes a stack comparator architecture which is particularly well suited for image processing
tasks that involve order statistic filtering. His methodology is similar to the alternative image algebra
formulation given above.
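The image algebra pseudocode referred to above is not reproduced here; the following NumPy sketch of a 3 × 3 median filter is an illustrative substitute, with replicated borders as an assumption about boundary handling.

```python
import numpy as np

def median_filter_3x3(a):
    """Replace each pixel by the median of its 3 x 3 neighborhood
    (edge-of-image pixels are replicated to fill the window)."""
    padded = np.pad(a, 1, mode='edge')
    windows = [padded[i:i + a.shape[0], j:j + a.shape[1]]
               for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)
```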
or, equivalently,
A γ between 0 and 1 results in a smoothing of the source image. A γ greater than 1 emphasizes the
high-frequency components of the source image, which sharpens detail. Figure 2.9.1 shows the result of
applying the unsharp masking technique to a mammogram for several values of γ. A 3 × 3 averaging
neighborhood was used to produce the low-frequency component image b.
A more general formulation for unsharp masking is given by
Here α is the weighting of the high-frequency component and β is the weighting of the
low-frequency component.
where t is the template defined in Figure 2.9.2. The values of v and w are and , respectively.
Figure 2.9.2 The Moore configuration template for unsharp masking using a simple convolution.
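A hedged sketch of the unsharp masking scheme described above (low-frequency component b from a 3 × 3 averaging neighborhood, gain γ, and the more general α/β weighting); the specific combination b + γ(a − b), which matches the stated behavior for γ < 1 and γ > 1, is an assumption:

```python
import numpy as np

def local_average_3x3(a):
    padded = np.pad(a.astype(float), 1, mode='edge')
    return sum(padded[i:i + a.shape[0], j:j + a.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def unsharp_mask(a, gamma=1.5):
    # gamma > 1 emphasizes detail, 0 < gamma < 1 smooths
    b = local_average_3x3(a)
    return b + gamma * (a - b)

def unsharp_mask_general(a, alpha, beta):
    # alpha weights the high-frequency component, beta the low-frequency one
    b = local_average_3x3(a)
    return alpha * (a - b) + beta * b
```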
The image d actually represents the local standard deviation, bounded below by β > 0
in order to avoid problems with division by zero in the next step of this technique.
The enhancement technique of [11] is a high-frequency emphasis scheme which uses local standard deviation
to control gain. The enhanced image b is given by
As seen in Section 2.9, a - m represents a highpass filtering of a in the spatial domain. The local gain factor
1/d is inversely proportional to the local standard deviation. Thus, a larger gain will be applied to regions with
low contrast. This enhancement technique is useful for images whose information of interest lies in the
high-frequency component and whose contrast levels vary widely from region to region.
The local contrast enhancement technique of Narendra and Fitch [12] is similar to the one above, except for a
slightly different gain factor and the addition of a low-frequency component. In this technique, let
where 0 < α < 1. The addition of the low-frequency component m is used to restore the average intensity
level within local regions.
We need to remark that in both techniques above it may be necessary to put limits on the gain factor to
prevent excessive gain variations.
The mathematical formulation is as follows. Let a denote the source image, n = card(X), and nj be the
number of times the gray level j occurs in the image a. Recall that the nj sum to n. The enhanced
image b is given by
where the variable z, ranging over the gray levels z1, …, zn of the source image, represents a gray level value of the source image and g, ranging over g1, …, gm, is the
variable that represents a gray level value of the enhanced image. The method of deriving a transfer function
to produce an enhanced image with a prescribed distribution of pixel values can be found in either Gonzalez
and Wintz [1] or Pratt [2].
Table 2.12.1 lists the transfer functions for several of the histogram modification examples featured in Pratt
[2]. In the table P(z) is the cumulative probability distribution of the input image and α is an empirically
derived constant. The image algebra formulations for histogram modification based on the transfer functions
in Table 2.12.1 are presented next.
Table 2.12.1 Histogram Transfer Functions
Name of modification Transfer function T(z)
Uniform modification: g = [gm − g1]P(z) + g1
Exponential modification:
Rayleigh modification:
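A hedged sketch of the uniform modification row of Table 2.12.1, assuming 8-bit integer gray levels:

```python
import numpy as np

def uniform_histogram_modification(a, g1=0.0, gm=255.0):
    """Map gray levels through g = (gm - g1) * P(z) + g1, where P is the
    cumulative distribution of gray levels in the source image a."""
    a = a.astype(np.int64)
    hist = np.bincount(a.ravel(), minlength=256)
    P = np.cumsum(hist) / a.size          # cumulative probability P(z)
    transfer = (gm - g1) * P + g1         # transfer function T(z)
    return transfer[a]
```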
Suppose a is the source image. Let â denote the Fourier transform of a and consider a lowpass
filter transfer function. The enhanced image g is obtained by multiplying â by the transfer function,
which attenuates high frequencies, and applying the inverse Fourier transform.
Sections 8.2 and 8.4 present image algebra implementations of the Fourier transform and its inverse.
Table 2.12.2 Image Algebra Formulation of Histogram Transfer Functions
Name of modification Image algebra formulation
Uniform modification:
Exponential modification:
Rayleigh modification:
Let be the distance from the point (u, v) to the origin of the frequency plane; that is,
The transfer function of the ideal lowpass filter is given by
represents a disk of radius d of unit height centered at the origin (0, 0). On the other hand, the Fourier
transform of an image a defined on X results in a complex-valued image â. In
other words, the location of the origin with respect to â is not the center of X,
but the upper left-hand corner of â. Therefore, the product of the disk image with â is undefined. Simply shifting the image â so that
the midpoint of X moves to the origin, or shifting the disk image so that the disk’s center will be located at the
midpoint of X, will result in properly aligned images that can be multiplied, but will not result in a lowpass
filter since the high frequencies, which are located at the corners of â, would be eliminated.
There are various options for the correct implementation of the lowpass filter. One such option is to center
the spectrum of the Fourier transform at the midpoint of X (Section 8.3) and then multiply the centered
transform by the shifted disk image. The result of this process needs to be uncentered prior to applying the
inverse Fourier transform. The exact specification of the ideal lowpass filter is given by the following
algorithm.
Suppose and , where m and n are even integers. Define the point set
and set
Figure 2.13.5 Lowpass filtering with ideal filter at various power levels.
The algorithm now becomes
In this algorithm, 1 and 0 denote the complex-valued unit and zero image, respectively.
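As a hedged illustration of the first option (center the spectrum, multiply by the disk, uncenter, invert), the following NumPy sketch assumes a real-valued m × n image and a cutoff radius d; it is not the image algebra pseudocode referred to above.

```python
import numpy as np

def ideal_lowpass(a, d):
    """Ideal lowpass filtering: keep frequencies within distance d of the
    centered origin of the spectrum, discard the rest."""
    m, n = a.shape
    A = np.fft.fftshift(np.fft.fft2(a))          # centered spectrum
    u = np.arange(m) - m // 2
    v = np.arange(n) - n // 2
    r = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    disk = (r <= d).astype(float)                # unit-height disk of radius d
    g = np.fft.ifft2(np.fft.ifftshift(A * disk)) # uncenter, then invert
    return np.real(g)
```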
A second option is to leave the spectrum uncentered and to translate the disk image directly to the corners of X in order to obtain the
desired multiplication result. Specifically, we obtain the following form of the lowpass filter algorithm:
Figure 2.13.6 Illustration of the basic steps involved in the ideal lowpass filter.
Here the notation means restricting the image d to the point set X and then extending d on X to the zero image
wherever it is not defined on X. The basic steps of this algorithm are shown in Figure 2.13.7.
Figure 2.13.7 Illustration of the basic steps as specified by the alternate version of the lowpass filter.
As for the ideal lowpass filter, there are several ways of implementing the kth-order Butterworth lowpass
filter. The transfer function for this filter is defined over the point set , and is defined by
where c denotes the scaling constant and the value needs to be converted to a complex value. For
correct multiplication with the image â, the transfer function needs to be shifted to the corners of X. The
exact specification of the remainder of the algorithm is given by the following two lines of pseudocode:
Once the transfer function is specified, the remainder of the algorithm is analogous to the lowpass filter
algorithm. Thus, one specification would be
This pseudocode specification can also be used for the Butterworth and exponential highpass filter transfer
functions which are described next.
The Butterworth highpass filter transfer function of order k with scaling constant c is given by
where
The transfer function for exponential highpass filtering is given by
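The transfer functions above are not reproduced here. As a hedged illustration only, the following sketch uses commonly cited forms of the kth-order Butterworth lowpass and highpass transfer functions with scaling constant c; whether these match the exact expressions intended in the text is an assumption. As noted above, the centered functions must still be shifted to the corners of X (for example with np.fft.ifftshift) before multiplying an uncentered spectrum.

```python
import numpy as np

def radius_grid(m, n):
    # distance r(u, v) from the centered origin of the frequency plane
    u = np.arange(m) - m // 2
    v = np.arange(n) - n // 2
    return np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)

def butterworth_lowpass(m, n, c, k):
    r = radius_grid(m, n)
    return 1.0 / (1.0 + (r / c) ** (2 * k))

def butterworth_highpass(m, n, c, k):
    r = radius_grid(m, n)
    r = np.where(r == 0, 1e-12, r)     # avoid division by zero at the origin
    return 1.0 / (1.0 + (c / r) ** (2 * k))
```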
Chapter 3
Edge Detection and Boundary Finding Techniques
3.1. Introduction
Edge detection is an important operation in a large number of image processing applications such as image
segmentation, character recognition, and scene analysis. An edge in an image is a contour across which the
brightness of the image changes abruptly in magnitude or in the rate of change of magnitude.
Generally, it is difficult to specify a priori which edges correspond to relevant boundaries in an image. Image
transforms which enhance and/or detect edges are usually task domain dependent. We, therefore, present a
wide variety of commonly used transforms. Most of these transforms can be categorized as belonging to one
of the following four classes: (1) simple windowing techniques to find the boundaries of Boolean objects, (2)
transforms that approximate the gradient, (3) transforms that use multiple templates at different orientations,
and (4) transforms that fit local intensity values with parametric edge models.
The goal of boundary finding is somewhat different from that of edge detection methods which are generally
based on intensity information methods classified in the preceding paragraph. Gradient methods for edge
detection, followed by thresholding, typically produce a number of undesired artifacts such as missing edge
pixels and parallel edge pixels, resulting in thick edges. Edge thinning processes and thresholding may result
in disconnected edge elements. Additional processing is usually required in order to group edge pixels into
coherent boundary structures. The goal of boundary finding is to provide coherent one-dimensional boundary
features from the individual local edge pixels.
(b) The image b is an interior 8-boundary image if B is 8-connected, B ⊆ A, and B is the set of points
in A whose 4-neighborhoods intersect A′. The interior 8-boundary b can be expressed as
(c) The image b is an exterior 4-boundary image if B is 4-connected, B ⊆ A′, and B is the set of points
in A′ whose 8-neighborhoods intersect A. That is, the image b is defined by
(d) The image b is an interior 4-boundary image if B is 4-connected, B ⊆ A, and B is the set of points
in A whose 8-neighborhoods intersect A′. Thus,
Figure 3.2.1 below illustrates the boundaries just described. The center image is the original image. The
8-boundaries are to the left, and the 4-boundaries are to the right. Exterior boundaries are black. Interior
boundaries are gray.
Figure 3.2.1 Interior and exterior 8-boundaries (left), original image (center), and interior and exterior
4-boundaries (right).
Let a ∈ {0, 1}^X be the source image. The boundary image will be denoted by b.
(1) Exterior 8-boundary —
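The boundary formulas themselves are not reproduced here. A hedged sketch of the exterior 8-boundary for a Boolean source image, following the description above (points of the complement whose Moore neighborhoods intersect A); the other boundary types can be obtained analogously:

```python
import numpy as np

def exterior_8_boundary(a):
    """a is a 0/1 image of the set A; return the points of the complement A'
    whose 8-neighborhoods intersect A."""
    padded = np.pad(a, 1)
    dilated = np.zeros_like(a)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            dilated = np.maximum(dilated,
                                 padded[1 + di:1 + di + a.shape[0],
                                        1 + dj:1 + dj + a.shape[1]])
    return dilated * (1 - a)     # in A' and adjacent to A
```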
Let be the source image. The edge enhanced image can be obtained by one of the
following difference methods:
(1) Horizontal differencing
or
or
or
Given the source image , the edge enhanced image is obtained by the appropriate
image-template convolution below.
(1) Horizontal differencing
or
where
(2) Vertical differencing
or
where
or
where the templates t and r are defined as above and v, w are defined by
and
Figure 3.3.4 Gradient approximation: left |a •t| + |a •r|, right |a •v| + |a •w|.
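A hedged sketch of discrete-differencing edge enhancement combining horizontal and vertical differences as in Figure 3.3.4; simple forward differences are assumed, since the template definitions are not reproduced here:

```python
import numpy as np

def horizontal_difference(a):
    b = np.zeros_like(a, dtype=float)
    b[:, :-1] = a[:, :-1] - a[:, 1:]      # b(i, j) = a(i, j) - a(i, j+1)
    return b

def vertical_difference(a):
    b = np.zeros_like(a, dtype=float)
    b[:-1, :] = a[:-1, :] - a[1:, :]      # b(i, j) = a(i, j) - a(i+1, j)
    return b

def edge_enhanced(a):
    # combine both directions by summing absolute responses
    return np.abs(horizontal_difference(a)) + np.abs(vertical_difference(a))
```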
Let be the source image, and a0, a1, …, a7 denote the pixel values of the eight neighbors of (i, j)
enumerated in the counterclockwise direction as follows:
Let u = (a5 + a6 + a7) - (a1 + a2 + a3) and v = (a0 + a1 + a7) - (a3 + a4 + a5). The edge enhanced image
is given by
Pictorially we have:
Here we use the common programming language convention for the arctangent of two variables which
defines
Figure 3.5.1 Motorcycle and the image that represents the magnitude component of the Prewitt edge
detector.
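A hedged sketch of a Prewitt-style magnitude and direction computation; the standard Prewitt masks are assumed here, since the pictorial neighbor enumeration is not reproduced:

```python
import numpy as np

# Standard Prewitt masks (assumed; the text defines u and v through neighbor sums).
PX = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=float)
PY = np.array([[-1, -1, -1],
               [ 0,  0,  0],
               [ 1,  1,  1]], dtype=float)

def apply_3x3(a, k):
    padded = np.pad(a.astype(float), 1, mode='edge')
    out = np.zeros_like(a, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * padded[i:i + a.shape[0], j:j + a.shape[1]]
    return out

def prewitt(a):
    u = apply_3x3(a, PX)                   # horizontal differences
    v = apply_3x3(a, PY)                   # vertical differences
    magnitude = np.sqrt(u ** 2 + v ** 2)
    direction = np.arctan2(v, u)           # two-argument arctangent convention
    return magnitude, direction
```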
Let be the source image, and a0, a1, …, a7 denote the pixel values of the eight neighbors of (i, j)
enumerated in the counterclockwise direction as follows:
The Sobel edge magnitude image is given by
where
and
or
Pictorially, t is given by
Comments and Observations
The Wallis edge detector is insensitive to a global multiplicative change in image values. The edge image of a
will be the same as that of n · a for any positive constant n.
Note that if the edge image is to be thresholded, it is not necessary to compute the logarithm of a. That is, if
and
c := a •t,
then
Figure 3.7.1 shows the result of applying the Wallis edge detector to the motorcycle image. The logarithm
used for the edge image is base 2 (b = 2).
A 3 × 3 subimage b of an image a may be thought of as a vector in a nine-dimensional vector space. For example, the subimage shown in
Figure 3.8.1 has vector representation
The mathematical structure of the vector space carries over to the vector space of 3 × 3 subimages in the
obvious way.
Let V denote the vector space of 3 × 3 subimages. An orthogonal basis for V that is used for the
Frei-Chen method is the one shown in Figure 3.8.2. The subspace E of V that is spanned by the subimages v1,
v2, v3, and v4 is called the edge subspace of V. The Frei-Chen edge detection method bases its determination
of edge points on the size of the angle between the subimage b and its projection on the edge subspace [11,
12, 13].
Figure 3.8.2 The basis used for Frei-Chen feature detection.
The angle θE between b and its projection on the edge subspace is given by
The · operator in the formula above is the familiar dot (or scalar) product defined for vector spaces. The dot
product of b, c ∈ V is given by
Small values of θE imply a better fit between b and the edge subspace.
For each point (x, y) in the source image a, the Frei-Chen algorithm for edge detection calculates the angle θE
between the 3 × 3 subimage centered at (x, y) and its projection on the edge subspace of V. The smaller the
value of θE at (x, y) is, the better an edge point (x, y) is deemed to be. After θE is calculated for each image point, a
threshold τ is applied to select points for which θE ≤ τ.
Figure 3.8.3 shows the results of applying the Frei-Chen edge detection algorithm to a source image of a variety
of peppers. Thresholds were applied for angles θE of 18, 19, 20, 21, and 22°.
which is easier to compute. In this case, a larger value indicates a stronger edge point.
The Frei-Chen method can also be used to detect lines. Subimages v5, v6, v7, and v8 form the basis of the line
subspace of V. The Frei-Chen edge detection method bases its determination of lines on the size of the angle
between the subimage b and its projection on the line subspace. Thresholding for line detection is done using
the statistic
The image that represents the magnitude of the edge gradient is given by
where sk = ak + ak+1 + ak+2 and tk = ak+3 + ak+4 + … + ak+7. The subscripts are evaluated modulo 8. Further
details about this method of directional edge detection are given in [11, 8, 1, 14, 4].
The image d ∈ {0, 1, …, 7}^X that represents the gradient direction is defined by
The image d · 45° can be used if it is desirable to express the gradient angle in degree measure. Figure 3.9.1 shows
the magnitude image that results from applying the Kirsch edge detector to the motorcycle image.
and for each 8-neighbor x of a point (i, j), the point x′ defined by x′ = x − (i, j) is a member of M′.
For the source image let bk := |a•t(k)|. The Kirsch edge magnitude image m is given by
In actual implementation, the algorithm for computing the image m will probably involve the loop
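A hedged sketch of the kind of loop alluded to above: apply each of the eight masks, keep the largest absolute response and the index at which it occurs. The standard Kirsch weights (three 5's and five −3's around the border, 0 at the center) are an assumption:

```python
import numpy as np

def kirsch_masks():
    """Eight masks built by rotating the ring of border weights around the 3 x 3 window."""
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    weights = [5, 5, 5, -3, -3, -3, -3, -3]
    masks = []
    for k in range(8):
        m = np.zeros((3, 3))
        for idx, (di, dj) in enumerate(ring):
            m[di + 1, dj + 1] = weights[(idx - k) % 8]
        masks.append(m)
    return masks

def kirsch(a):
    padded = np.pad(a.astype(float), 1, mode='edge')
    magnitude = np.full(a.shape, -np.inf)
    direction = np.zeros(a.shape, dtype=int)
    for k, mask in enumerate(kirsch_masks()):          # the loop over the masks
        response = np.zeros_like(a, dtype=float)
        for i in range(3):
            for j in range(3):
                response += mask[i, j] * padded[i:i + a.shape[0], j:j + a.shape[1]]
        better = np.abs(response) > magnitude
        magnitude = np.where(better, np.abs(response), magnitude)
        direction = np.where(better, k, direction)
    return magnitude, direction
```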
For a given source image , let ai = a * mi, where m0, m1, …, m(n/2)-1 denote the edge masks and *
denotes convolution. The edge magnitude image is given by
Let θi be the direction associated with the mask mi. Suppose that m(x) = |ak(x)|. The direction image
is given by
As an illustration, the directional edge technique is applied to the infrared image of a causeway across a bay
as shown in Figure 3.10.1. The six masks of Figure 3.10.2 are used to determine one of twelve directions, 0°,
30°, 60°, …, or 330°, to be assigned to each point of the source image. The result of applying the directional
edge detection algorithm is seen in Figure 3.10.3. In the figure, directions are presented as tangent vectors to
the edge rather than gradient vectors (tangent to the edge is orthogonal to the gradient). With this
representation, the bridge and causeway can be characterized by two bands of vectors. The vectors in one
band run along a line in one direction, while those in the other band run along a line in the opposite direction.
The edge image representation is different from the source image representation in that it is a plot of a vector
field. The edge magnitude image was thresholded (see Section 4.2), and vectors were attached to the points
that survived the thresholding process. The vector at each point has length proportional to the edge magnitude
at the point, and points in the direction of the edge.
Note that the result b contains at each point both the maximal edge magnitude and the direction associated
with that magnitude. As formulated above, the integers 0 through n − 1 are used to represent the n directions.
We first restrict our attention to the detection of horizontal edges. Let , where ,
be the source image. For each k, define the image vk by
The product
will be large only if each of its factors is large. The averaging that takes place over the large neighborhoods
makes the factors with large k less likely to pick up false edges due to noise. The factors from large
neighborhoods contribute to the product by cancelling out false edge readings that may be picked up by the
factors with small k. As (x1, x2) moves farther from an edge point, the factors with small k get smaller. Thus,
the factors in the product for small neighborhoods serve to pinpoint the edge point’s location.
The above scheme can be extended to two dimensions by defining another image h to be
where
The image h is sensitive to vertical edges. To produce an edge detector that is sensitive to both vertical and
horizontal edges, one can simply take the maximum of h and v at each point in X.
Figure 3.11.1 compares the results of applying discrete differencing and the product of the difference of
averages techniques. The source image (labeled 90/10) in the top left corner of the figure is of a circular
region whose pixel values have a 90% probability of being white against a background whose pixel values
have a 10% probability of being white. The corresponding regions of the image in the upper right corner have
probabilities of white pixels equal to 70% and 30%. The center images show the result of taking the
maximum from discrete differencing along horizontal and vertical directions. The images at the bottom of the
figure show the results of the product of the difference of averages algorithm. Specifically, the figure’s edge
enhanced image e produced by using the product of the difference of averages is given by
and
where (y1, y2) = y. The edge enhanced image e produced by the product of the difference of averages
algorithm is given by the image algebra statement
The templates used in the formulation above are designed to be sensitive to vertical and horizontal edges.
Any of the edge detection methods presented in this chapter can be converted into a crack edge detector.
Below, we consider the particular case of discrete differencing.
For a given source image a, pixel edge horizontal differencing will yield an image b such
that b(i, j) = a(i, j) − a(i, j + 1) for all (i, j) ∈ X. Thus if there is a change in value from point (i, j) to point (i, j +
1), the difference in those values is associated with point (i, j).
The horizontal crack edge differencing technique, given a, will yield a result image b defined on
W = {(i, r) : (i, r − ½), (i, r + ½) ∈ X}, that is, b’s point set corresponds to the cracks that lie between
horizontally neighboring points in X. The values of b are given by
One can compute the vertical differences as well, and store both horizontal and vertical differences in a single
image with scalar edge values. (This cannot be done using pixel edges since each pixel is associated with both
a horizontal and vertical edge.) That is, given , we can compute where
and
and define by
Another technique that can be used to transform the set of crack edges onto a less sparse subset of the plane is to
employ a spatial transformation that will rotate the crack edge points by angle π/4 and shift and scale
appropriately to map them onto points with integral coordinates. The transformation f : X → Z defined by
is just such a function. The effect of applying f to a set of edge points is shown in Figure 3.12.1 below.
For w = (x, y, z), let ||w|| denote the length of w, and define g1(w) = x/||w||, g2(w) = y/||w||, and g3(w) =
z/||w||.
Suppose a denotes the source image and
denotes a three-dimensional m × m × m neighborhood about the point w0 = (x0, y0, z0). The components of the
surface normal vector of the point w0 are given by
Thus, the surface normal vector of w0 is
where the value is gi(w − w0) if w ∈ M(w0). For example, consider M(w0) a 3 × 3 × 3 domain. The figure
below (Figure 3.13.1) shows the domain of the gi’s.
For k ∈ {0, 1, …, l − 1}, let ak denote the images in a pyramid, where Xk = {(i, j) : 0 ≤ i, j ≤ 2^k − 1}.
The source image is al−1. Given al−1, one can construct a0, …, al−2 in the following manner:
The algorithm starts at some intermediate level, k, in the pyramid. This level is typically chosen to be
.
If, for some given threshold T, bk(i, j) > T, then apply the boundary pixel formation technique at points (2i,
2j), (2i + 1, 2j), (2i, 2j + 1), (2i + 1, 2j + 1) in level k + 1. All other points at level k + 1 are assigned value 0.
Boundary formation is repeated until level l - 1 is reached.
where
where
and be defined by
Note that in contrast to the functions g and , g1 and are functions defined on the same pyramid level.
We now compute bm by
Set
el = al - a0
for all l ∈ {1, …, 8} and for some positive threshold number T define p : as
Some common K-form neighborhoods are pictured in Figure 3.15.1. We use the notation
The value of is a single-digit decimal number encoding the topographical characteristics of the point in
question. Figure 3.15.2 graphically represents each of the possible values for the horizontal 2-form showing a
pixel and the orientation of the surface to left and right of that pixel. (The pixel is represented by a large dot,
and the surface orientation on either side is shown by the orientation of the line segments to the left and right
of the dot.)
The ternary K-forms assign a ternary number to each topographical configuration. The ternary horizontal
2-form is given by
Note that the multiplicative constants are represented in base 3; thus these expressions form the ternary number
yielded by concatenating the results of applying p.
The values of other forms are calculated in a similar manner.
Three different ways of using the decimal 2-forms to construct a binary edge image are the following:
(a) An edge in relief is associated with any pixel whose K-form value is 0, 1, or 3.
(b) An edge in depth is associated with any pixel whose K-form value is 5, 7, or 8.
(c) An edge by gradient threshold is associated with a pixel whose K-form is not 0, 4, or 8.
Let the function p be defined as in the mathematical formulation above. The decimal horizontal 2-form is
defined by
If the fit is sufficiently accurate at a given location, an edge is assumed to exist with the same parameters as
the ideal edge model. (See Figure 3.16.1.) An edge is assumed if
Figure 3.16.1 One-dimensional edge fitting.
where ρ represents the distance from the center of a test disk D of radius R to the ideal step edge (ρ ≤ R), and
θ denotes the angle of the normal to the edge as shown in Figure 3.16.2.
In Hueckel’s method, both image data and the ideal edge model are expressed as vectors in the vector space of
continuous functions over the unit disk D = {(x, y) : x² + y² ≤ 1}. A basis for this vector space — also known
as a Hilbert space — is any complete sequence of orthonormalized continuous functions {hi : i = 0, 1, …}
with domain D. For such a basis, a and s have vector form a = (a1, a2, …) and s = (s1, s2, …), where
and
For application purposes, only a finite number of basis functions can be used. Hueckel truncates the infinite
Hilbert basis to only eight functions, h0, h1, …, h7. This provides for increased computational efficiency.
Furthermore, his sub-basis was chosen so as to have a lowpass filtering effect for inherent noise smoothing.
Although Hueckel provides an explicit formulation for his eight basis functions, their actual derivation has
never been published!
Having expressed the signal a and edge model s in terms of Hilbert vectors, minimization of the mean-square
and
Then
Note that when , then Q(r) = 0. Thus, on the boundary of D each of the
functions hi is 0. In fact, the functions hi intersect the disk D as shown in the following figure:
As a final remark, we note that the reason for normalizing r — i.e., instead of
forcing 0 ≤ r ≤ 1 — is to scale the disk D(x0, y0) back to the size of the unit
disk, so that a restricted to D(x0, y0) can be expressed as a vector in the Hilbert space of continuous functions
on D.
Hueckel’s algorithm. The algorithm proceeds as follows: Choose a digital disk of radius R (R is usually 4, 5,
6, 7, or 8, depending on the application) and a finite number of directions θj (j = 1,…, m) — usually θ = 0°,
45°, 90°, 135°.
Step 9. Set
and define
Remarks. The edge image e has either pixel value zero or the parameter (b, h, ρ, θk). Thus, with each edge
pixel we obtain parameters describing the nature of the edge. These parameters could be useful in further
analysis of the edges. If only the existence of an edge pixel is desired, then the algorithm can be shortened by
dropping STEP 8 and defining
Note that in this case we also need only compute a1, …, a7 since a0 is only used in the calculation of b.
It can be shown that the constant K in STEP 9 is the cosine of the angle between the vectors a = (a0, a1, …, a7)
and s = (s0, s1, …, s7). Thus, if the two vectors match exactly, then the angle is zero and K = 1; i.e., a is
already an ideal edge. If K < 0.9, then the fit is not good in the mean-square sense. For this reason the
threshold T is usually chosen in the range 0.9 ≤ T < 1. In addition to thresholding with T, a further test is often
made to determine if the edge constant factor h is greater than a threshold factor.
As a final observation we note that K would have remained the same whether the basis Hi or hi had been used
to compute the components ai.
Here y = (y1, y2) corresponds to the center of the disk D(y) of radius R and x = (x1, x2). For j in {0, 1, …, m},
define by fj(r) = (r, |r|, θj).
Let Y be the set of all the edge points in the rectangular point set S. Next, choose point r from Y such that the
distance from r to L(p, q) is maximal. Apply the above procedure to the pairs (p, r) and (r, q).
Note that f is partial in that it is defined only if there is a point on L(x, y) with second coordinate j.
Furthermore, let F(j, w, x, y, z) denote the set ; that is, the set of
points with second coordinate j bounded by L(w, x) and L(y, z).
If card(Y) > 0, define image by b(x) = d(x, L(p, q)). Let r = choice(domain ). Thus, as
shown in Figure 3.17.3, r will be an edge point in set S farthest from line segment L(p, q). The algorithm can
be repeated with points p and r, and with points r and q.
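A rough Python sketch of this recursive splitting step follows; it is offered only for orientation. The tolerance parameter, the helper names, and the simplification of passing the remaining edge points to both recursive calls are assumptions of the sketch, not part of the formulation above.

    import numpy as np

    def point_segment_distance(x, p, q):
        # distance from point x to the line segment L(p, q)
        p, q, x = np.asarray(p, float), np.asarray(q, float), np.asarray(x, float)
        d = q - p
        if not d.any():
            return float(np.linalg.norm(x - p))
        t = np.clip(np.dot(x - p, d) / np.dot(d, d), 0.0, 1.0)
        return float(np.linalg.norm(x - (p + t * d)))

    def split_edges(edge_points, p, q, tol=1.0):
        # edge_points: list of (row, col) tuples; returns an ordered list of break points
        if not edge_points:
            return [p, q]
        dists = [point_segment_distance(x, p, q) for x in edge_points]
        k = int(np.argmax(dists))
        if dists[k] <= tol:
            return [p, q]
        r = edge_points[k]                       # edge point farthest from L(p, q)
        rest = edge_points[:k] + edge_points[k + 1:]
        return split_edges(rest, p, r, tol)[:-1] + split_edges(rest, r, q, tol)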
Let be a finite point set and suppose g is a real-valued function of n discrete variables with each
variable a point of X. That is,
then a multistage optimization procedure using the following recursion process can be applied:
1. Define a sequence of functions recursively by setting
for k = 1,…, n - 1.
2. For each y ∈ X, determine the point mk+1(y) = xk ∈ X such that
that is, find the point xk for which the maximum of the function
is achieved.
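As an illustration of the recursion, here is a hedged Python sketch that assumes the figure-of-merit g decomposes into a sum of stage terms h(xk, xk+1); that decomposition, and all names below, are assumptions of the sketch rather than the book's formulation.

    def optimal_curve(points, h, n):
        # points: candidate pixel locations (hashable tuples); h(x, y): stage contribution
        # to the FOM; n: desired curve length. Returns x1, ..., xn maximizing the summed FOM.
        f = {y: 0.0 for y in points}               # f_0 is identically zero
        preds = []                                  # preds[k][y]: best predecessor of y
        for _ in range(n - 1):
            f_new, pred = {}, {}
            for y in points:
                f_new[y], pred[y] = max((f[x] + h(x, y), x) for x in points)
            f, preds = f_new, preds + [pred]
        curve = [max(points, key=lambda y: f[y])]   # best endpoint
        for pred in reversed(preds):                # backtrack through the stages
            curve.append(pred[curve[-1]])
        return list(reversed(curve))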
In addition, an a priori determined integer n, which represents the (desired) length of the curve, must be
supplied by the user as input.
For two pixel locations x, y of the image, the directional difference q(x, y) is defined by
The solution obtained through the multistage optimization process will be the optimal curve
with respect to the FOM function g.
Note that constraint (i) restricts the search to actual edge pixels. Constraint (ii) further restricts the search by
requiring the point xk to be an 8-neighbor of the point xk+1. Constraint (iii), the low-curvature constraint,
narrows the search for adjacent points even further by restricting the search to three points for each point on
the curve. For example, given xk+1 with d(xk+1) = 0, then the point xk preceding xk+1 can only have directional
values 0, 1, or 7. Thus, xk can only be one of the following 8-neighbors of xk+1 shown in Figure 3.18.1.
In the multistage optimization process discussed earlier, the FOM function g had no constraints. In order to
incorporate the constraints (i), (ii), and (iii) into the process, we define two step functions s1 and s2 by
and
Redefining g by
where
for k = 1, …, n - 1, results in a function which has constraints (i), (ii), and (iii) incorporated into its definition
and has the required format for the optimization process. The algorithm reduces now to applying steps 1
through 4 of the optimization process to the function g. The output of this process
represents the optimal curve of length n.
and a function by
curve .
Chapter 4
Thresholding Techniques
4.1. Introduction
This chapter describes several standard thresholding techniques. Thresholding is one of the simplest and most
widely used image segmentation techniques. The goal of thresholding is to segment an image into regions of
interest and to remove all other regions deemed inessential. The simplest thresholding methods use a single
threshold in order to isolate objects of interest. In many cases, however, no single threshold provides a good
segmentation result over an entire image. In such cases variable and multilevel threshold techniques based on
various statistical measures are used. The material presented in this chapter provides some insight into
different strategies of threshold selection.
The global thresholding procedure is straightforward. Let be the source image and [h, k] be a given
threshold range. The thresholded image b ∈ {0, 1}^X is given by
for all x ∈ X.
Two special cases of this methodology are concerned with the isolation of uniformly high values or uniformly
low values. In the first the thresholded image b will be given by
while in the second case
and
can be used to isolate objects of high values and low values, respectively.
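A minimal Python/NumPy sketch of these three cases (range, high-value, and low-value thresholding) follows; it is an illustration, not the book's image algebra statement, and the function names are assumptions.

    import numpy as np

    def threshold_range(a, h, k):
        # 1 where h <= a(x) <= k, 0 elsewhere
        return ((a >= h) & (a <= k)).astype(np.uint8)

    def threshold_high(a, k):
        return (a >= k).astype(np.uint8)

    def threshold_low(a, k):
        return (a <= k).astype(np.uint8)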
4.3. Semithresholding
Semithresholding is a useful variation of global thresholding [1]. Pixels whose values lie within a given
threshold range retain their original values. Pixels with values lying outside of the threshold range are set to 0.
For a source image and a threshold range [h, k], the semithresholded image is given by
for all x ∈ X.
Regions of high values can be isolated using
The images semithresholded over the unbounded sets [k, ∞) and (-∞, k] are given by
and
respectively.
If appropriate, instead of constructing a result over X, one can construct the subimage c of a containing only
those pixels lying in the threshold range, that is,
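The following Python/NumPy sketch illustrates semithresholding over a range, over the unbounded sets, and the restriction to the in-range subimage described above; the function names are illustrative assumptions.

    import numpy as np

    def semithreshold(a, h, k):
        # pixels inside [h, k] keep their values, all others are set to 0
        return np.where((a >= h) & (a <= k), a, 0)

    def semithreshold_high(a, k):        # threshold range [k, infinity)
        return np.where(a >= k, a, 0)

    def semithreshold_low(a, k):         # threshold range (-infinity, k]
        return np.where(a <= k, a, 0)

    def subimage_in_range(a, h, k):
        # values and coordinates of the pixels lying in the threshold range
        mask = (a >= h) & (a <= k)
        return a[mask], np.argwhere(mask)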
Let be the source image, and let k1,...,kn be threshold values satisfying k1 > k2 > ... > kn. These values
partition into n + 1 intervals which are associated with values v1,...,vn+1 in the thresholded result image. A
The exact methodology is as follows. Let be the source image, and let image denote the
region threshold value associated with each point in X, that is, d(x) is the threshold value associated with the
region in which point x lies. The thresholded image b ∈ {0, 1}^X is defined by
For a given image , where X is an m × n grid, the mean μ and standard deviation σ of a are given by
and
respectively. The threshold level τ is set at
Let , where . The mean and standard deviation of a are given by the image
algebra statements
and
Figure 4.6.1 shows an original IR image of a jeep (top) and three binary images that resulted from thresholding
the original based on its mean and standard deviation for various k1 and k2.
Figure 4.6.1 Thresholding an image based on threshold level τ = k1μ + k2σ for various k1 and k2: (a) k1 = 1,
k2 = 1, τ = 145, (b) k1 = 1, k2 = 2, τ = 174, (c) k1 = 1.5, k2 = 1, τ = 203. At the bottom of the figure is the
histogram of the original image. The threshold levels corresponding to images (a), (b), and (c) have been
marked with arrows.
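The computation reduces to two image statistics; the sketch below (Python/NumPy, with illustrative names) sets τ = k1·μ + k2·σ and thresholds the image at that level.

    import numpy as np

    def mean_std_threshold(a, k1=1.0, k2=1.0):
        mu, sigma = float(a.mean()), float(a.std())
        tau = k1 * mu + k2 * sigma
        return (a >= tau).astype(np.uint8), tau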
by maximizing the separability between classes, with between-class variance used as the measure of
separability. Its definition, which follows below, uses histogram information derived from the source
image. After threshold values have been determined, they can be used as parameters in the multithresholding
algorithm of Section 4.4.
Let and let be the normalized histogram of a (Section 10.10). The pixels of a are to be
partitioned into k classes C0, C1,…,Ck-1 by selecting the τi’s as stipulated below.
where is the 0th-order cumulative moment of the histogram evaluated up to level τi.
The class mean levels are given by
where is the 1st-order cumulative moment of the histogram up to level τi and
variance is used as discriminant criterion measure of class separability. This between-class variance is
defined as
or, equivalently,
Note that is a function of τ1, τ2, …, τk-1 (if we substitute the corresponding expressions involving the
τi’s). The problem of determining the goodness of the τi’s reduces to optimizing , i.e., to finding the
maximum of or
To the left in Figure 4.7.1 is the blurred image of three nested squares with pixel values 80, 160, and 240. Its
histogram below in Figure 4.7.2 shows the two threshold levels, 118 and 193, that result from maximizing the
variance between three classes. The result of thresholding the blurred image at the levels above is seen in the
image to the right of Figure 4.7.1.
Figure 4.7.1 Blurred image (left) and the result of thresholding it (right).
The image algebra pseudocode for the threshold finding algorithm for k = 3 is given by the following
formulation:
When the algorithm terminates, the desired thresholds are τ1 and τ2.
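For orientation, the sketch below (Python/NumPy) finds τ1 and τ2 for the k = 3 case by brute-force search over threshold pairs, scoring each pair with the between-class variance computed from the normalized histogram. It is an illustrative stand-in for the book's pseudocode, not a transcription of it, and a 256-level histogram is assumed.

    import numpy as np

    def otsu_two_thresholds(a, levels=256):
        hist, _ = np.histogram(a, bins=levels, range=(0, levels))
        p = hist / hist.sum()                        # normalized histogram
        grey = np.arange(levels)
        mu_total = float((grey * p).sum())
        best, best_taus = -1.0, (0, 0)
        for t1 in range(1, levels - 1):
            for t2 in range(t1 + 1, levels):
                sigma_b = 0.0
                for cls in (slice(0, t1), slice(t1, t2), slice(t2, levels)):
                    w = p[cls].sum()                 # class probability
                    if w > 0:
                        mu = (grey[cls] * p[cls]).sum() / w
                        sigma_b += w * (mu - mu_total) ** 2
                if sigma_b > best:                   # keep the pair maximizing sigma_b
                    best, best_taus = sigma_b, (t1, t2)
        return best_taus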
Figure 4.7.4 Magnitude image after applying directional edge detection to Figure 4.7.3.
Figure 4.7.5 Result of thresholding Figure 4.7.4 at level specified by maximizing between-class variance.
The fraction on the right-hand side of the equality is the midpoint between the gray level of the object and
background. Intuitively, this midpoint is an appealing level for thresholding. Experimentation has shown that
reasonably good performance has been achieved by thresholding at this midpoint level [6]. Thus, the SIS
algorithm sets its threshold τ equal to
The image to the left in Figure 4.8.1 is that of a square region with pixel value 192 set against a background
of pixel value 64 that has been blurred with a smoothing filter. Its histogram (seen in Figure 4.8.2) is bimodal.
The threshold value 108 selected based on the SIS method divides the two modes of the histogram. The result
of thresholding the blurred image at level 108 is seen in the image to the right of Figure 4.8.1.
Figure 4.8.1 Blurred image (left) and the result of thresholding (right) using the threshold level determined
by the SIS method.
Let be the source image, where . Let s and t be the templates pictured
below.
Let e := |a•s| ∨ |a•t|. The SIS threshold is given by the image algebra statement
Figure 4.8.2 Histogram of blurred image.
Figure 4.8.6 Histogram of original image with threshold levels marked by arrows.
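A hedged Python/NumPy sketch of the SIS threshold follows: an edge-strength image e built from horizontal and vertical differences weights each pixel, and the threshold is the e-weighted mean of the gray values. The central-difference masks used below stand in for the templates s and t pictured above and are an assumption of the sketch.

    import numpy as np

    def sis_threshold(a):
        a = a.astype(float)
        gx = np.zeros_like(a)
        gy = np.zeros_like(a)
        gx[:, 1:-1] = np.abs(a[:, 2:] - a[:, :-2])   # horizontal differences
        gy[1:-1, :] = np.abs(a[2:, :] - a[:-2, :])   # vertical differences
        e = np.maximum(gx, gy)                       # edge-strength image
        return float((e * a).sum() / max(e.sum(), 1e-12))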
4.9. References
1 A. Rosenfeld and A. Kak, Digital Picture Processing. New York, NY: Academic Press, 2nd ed.,
1982.
2 N. Hamadani, “Automatic target cueing in IR imagery,” Master’s thesis, Air Force Institute of
Technology, WPAFB, Ohio, Dec. 1981.
3 N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems,
Man, and Cybernetics, vol. SMC-9, pp. 62-66, Jan. 1979.
4 P. K. Sahoo, S. Soltani, A. C. Wong, and Y. C. Chen, “A survey of thresholding techniques,”
Computer Vision, Graphics, and Image Processing, vol. 41, pp. 233-260, 1988.
5 J. Kittler, J. Illingworth, and J. Foglein, “Threshold selection based on a simple image statistic,”
Computer Vision, Graphics, and Image Processing, vol. 30, pp. 125-147, 1985.
6 S. U. Lee, S. Y. Chung, and R. H. Park, “A comparative performance study of several global
thresholding techniques for segmentation,” Computer Vision, Graphics, and Image Processing, vol. 52,
pp. 171-190, 1990.
Chapter 5
Thinning and Skeletonizing
5.1. Introduction
Thick objects in a discrete binary image are often reduced to thinner representations called skeletons, which
are similar to stick figures. Most skeletonizing algorithms iteratively erode the contours in a binary image
until a thin skeleton or a single pixel remains. These algorithms typically examine the neighborhood of each
contour pixel and identify those pixels that can be deleted and those that can be classified as skeletal pixels.
Thinning and skeletonizing algorithms have been used extensively for processing thresholded images, data
reduction, pattern recognition, and counting and labeling of connected regions. Another thinning application
is edge thinning which is an essential step in the description of objects where boundary information is vital.
Algorithms given in Chapter 3 describe how various transforms and operations applied to digital images yield
primitive edge elements. These edge detection operations typically produce a number of undesired artifacts
including parallel edge pixels which result in thick edges. The aim of edge thinning is to remove the inherent
edge broadening in the gradient image without destroying the edge continuity of the image.
where OB(R) is the outside boundary of region R. The trade-off for connectivity is higher computational cost
and the possibility that the thinned image may not be reduced as much. Definitions for the various boundaries
of a Boolean image and boundary detection algorithms can be found in Section 3.2.
Figure 5.2.1 below shows the disconnected and connected skeletons of the SR71 superimposed over the
original image.
When the loop terminates the thinned image will be contained in image variable a. The correspondences
between the images in the code above and the mathematical formulation are
If connected components are desired then the last statement in the loop should be changed to
Let , and let A denote the support of a. The medial axis is the set, ,
consisting of those points x for which there exists a ball of radius rx, centered at x, that is contained in A and
intersects the boundary of A in at least two distinct points. The dotted line (center dot for the circle) of Figure
5.3.1 represents the medial axis of some simple regions.
The medial axis transform is unique. The original image can be reconstructed from its medial axis transform.
The medial axis transform evolved through Blum’s work on animal vision systems [1-10]. His interest
involved how animal vision systems extract geometric shape information. There exists a wide range of
applications that finds a minimal representation of an image useful [7]. The medial axis transform is
especially useful for image compression since reconstruction of an image from its medial axis transform is
possible.
Let Br (x) denote the closed ball of radius r and centered at x. The medial axis of region A is the set
The reconstruction of the domain of the original image a in terms of the medial axis transform m is given by
The result of applying the medial axis transform to the SR71 can be seen in Figure 5.3.3. The medial axis
transformation is superimposed over the original image of the SR71 which is represented by dots.
Hexadecimal numbering is used for the display of the medial axis transform.
Figure 5.3.3 MAT superimposed over the SR71.
The original image a can be recovered from the medial axis transform m using the image algebra expression
The reconstruction in this case will not be exact. Applying the algorithm to a small square block will show
how the reconstruction fails to be exact.
Let a be a binary image defined on with for feature pixels and 0 for non-feature pixels. The SR71
of Figure 5.4.1 will serve as our source image for illustration purposes. An asterisk represents a pixel value
of ∞. Non-feature pixel values are not displayed.
The distance transform of a is a gray level image such that pixel value b(i, j) is the distance
between the pixel location (i, j) and the nearest non-feature pixel. The distance can be measured in terms of
Euclidean, city block, or chessboard distance, etc., subject to the application’s requirements. The result of
applying the city block distance transform to the image of Figure 5.4.1 can be seen in Figure 5.4.2. Note that
hexadecimal numbering is used in the figure.
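A sketch of the two-pass (forward and backward raster scan) city block distance transform in Python follows; taking a Boolean feature mask as input and initializing feature pixels to a large value are conventions assumed for the sketch.

    import numpy as np

    def city_block_distance_transform(feature):
        # feature: boolean (or 0/1) array, True at feature pixels
        m, n = feature.shape
        INF = m + n
        b = np.where(feature, INF, 0).astype(int)
        # forward scan: propagate distances from above and from the left
        for i in range(m):
            for j in range(n):
                if b[i, j]:
                    up = b[i - 1, j] if i > 0 else INF
                    left = b[i, j - 1] if j > 0 else INF
                    b[i, j] = min(b[i, j], up + 1, left + 1)
        # backward scan: propagate distances from below and from the right
        for i in range(m - 1, -1, -1):
            for j in range(n - 1, -1, -1):
                if b[i, j]:
                    down = b[i + 1, j] if i < m - 1 else INF
                    right = b[i, j + 1] if j < n - 1 else INF
                    b[i, j] = min(b[i, j], down + 1, right + 1)
        return b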
The distance skeleton c of the distance transform b is the image whose nonzero pixel values consist of local
maxima of b. Figure 5.4.3 represents the distance skeleton extracted from Figure 5.4.2 and superimposed over
the original SR71 image.
The restoration transform is used to reconstruct the distance transform b from the distance skeleton c.
Let , and a ∈ {0, ∞}^X be the source image. Note that the templates , and u may be
defined according to the specific distance measure being used. The templates used for our city block distance
example are defined pictorially below in Figure 5.4.4.
Distance transform
The distance transform b of a, computed recursively, is given by
where is the forward scanning order defined on X and is the backward scanning order defined on X.
The parallel image algebra formulation for the distance transform is given by
Figure 5.4.4 Templates used for city block distance image algebra formulation.
Let B(p) denote the number of nonzero 8-neighbors of p and let A(p) denote the number of zero-to-one
transitions in the ordered sequence p1, p2, …, p8, p1. If p is a contour point and its 8-neighbors satisfy the
four conditions listed below, then p will be removed on the first subiteration. That is, the new value
associated with p will be 0. The conditions for boundary pixel removal are
(a) 2 ≤ B(p) ≤ 6
(b) A(p) = 1
(c) p1 · p3 · p5 = 0
(d) p3 · p5 · p7 = 0
In the example shown in Figure 5.5.2, p is not eligible for deletion during the first subiteration. With this
configuration B(p) = 5 and A(p) = 3.
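The first-subiteration deletion test can be written down directly. The Python sketch below assumes the 8-neighbors p1, …, p8 are supplied in the clockwise order used by the book's figure; that ordering, the starting neighbor, and the function name are assumptions of the sketch.

    def zhang_suen_delete_first(nbrs):
        # nbrs: sequence p1..p8 of 0/1 neighbor values in clockwise order
        p = list(nbrs)
        B = sum(p)                                                  # condition (a)
        A = sum(1 for k in range(8) if p[k] == 0 and p[(k + 1) % 8] == 1)  # condition (b)
        return (2 <= B <= 6 and A == 1
                and p[0] * p[2] * p[4] == 0                         # (c): p1*p3*p5 = 0
                and p[2] * p[4] * p[6] == 0)                        # (d): p3*p5*p7 = 0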
Iteration continues until either subiteration produces no change in the image. Figure 5.5.3 shows the
Zhang-Suen skeleton superimposed over the original image of the SR71.
and
S2 = {3, 6, 7, 12, 14, 15, 24, 28, 30, 31, 48, 56, 60, 62,
63, 96, 112, 120, 124, 126, 129, 131, 135, 143,
159, 192, 193, 195, 224, 225, 227, 240, 248, 252}
to identify points that satisfy the conditions for deletion. S1 targets points that satisfy (a) to (d) for the first
subiteration and S2 targets points that satisfy (a), (b), (c2), and (d2) for the second subiteration.
respectively.
Figure 5.6.1 Original Zhang-Suen transform (left) and modified Zhang-Suen transform (right).
Figure 5.7.1 Directions and their 8-neighborhood normals about (i, j).
An edge element is deemed to exist at the point (i, j) if any one of the following sets of conditions hold:
(1) The edge magnitudes of points on the normal to d(i, j) are both zero.
(2) The edge directions of the points on the normal to d(i, j) are both within 30° of d(i, j), and m(i, j) is
greater than the magnitudes of its neighbors on the normal.
(3) One neighbor on the normal has direction within 30° of d(i, j), the other neighbor on the normal has
direction within 30° of d(i, j) + 180°, and m(i, j) is greater than the magnitude of the former of its two
neighbors on the normal.
The result of applying the edge thinning algorithm to Figure 3.10.3 can be seen in Figure 5.7.2.
Let be an image on the point set X, where is the edge magnitude image, and
is the edge direction image.
The parameterized template t(i) has support in the three-by-three 8-neighborhood about its target point. The
ith neighbor (ordered clockwise starting from bottom center) has value 1. All other neighbors have value 0.
Figure 5.7.3 provides an illustration for the case i = 4. The small number in the upper right-hand corner of the
template cell indicates the point’s position in the neighborhood ordering.
The function g : {0, 30, …, 330} → {0, 1, …, 7} defined by
is used to define the image i = g(d) ∈ {0, …, 7}^X. The pixel value i(x) is equal to the positional value of one
of the 8-neighbors of x that is on the normal to d(x).
The function f is used to discriminate between points that are to remain as edge points and those that are to be
deleted (have their magnitude set to 0). It is defined by
Chapter 6
Connected Component Algorithms
6.1. Introduction
A wide variety of techniques employed in computer vision reduce gray level images to binary images. These
binary images usually contain only objects deemed interesting and worthy of further analysis. Objects of
interest are analyzed by computing various geometric properties such as size, shape, or position. Before such
an analysis is done, it is often necessary to remove various undesirable artifacts such as feelers from
components through pruning processes. Additionally, it may be desirable to label the objects so that each
object can be analyzed separately and its properties listed under its label. As such, component labeling can be
considered a fundamental segmentation technique.
The labeling of connected components also provides the number of components in an image. However,
there are often faster methods for determining the number of components. For example, if components
contain small holes, the holes can be rapidly filled. An application of the Euler characteristic to an image
containing components with no holes provides one fast method for counting components. This chapter
provides some standard labeling algorithms as well as examples of pruning, hole filling, and component
counting algorithms.
The set of interest for component labeling of an image a ∈ {0, 1}^X is the set Y of points that have
pixel value 1. This set is partitioned into disjoint connected subsets. The partitioning is represented by an
image b in which all points of Y that lie in the same connected component have the same pixel value. Distinct
pixel values are assigned to distinct connected components.
Either 4- or 8-connectivity can be used for component labeling. For example, let the image to the left in
Figure 6.2.1 be the binary source image (pixel values equal to 1 are displayed with black). The image in the
center represents the labeling of 4-components. The image to the right represents the 8-components. Different
colors (or gray levels) are often used to distinguish connected components. This is why component labeling is
also referred to as blob coloring. From the different gray levels in Figure 6.2.1, we see that the source image
has five 4-connected components and one 8-connected component.
Figure 6.2.1 Original image, labeled 4-component image, and labeled 8-component image.
The algorithm starts by assigning each black pixel a unique label and uses c to keep track of component labels
as it proceeds. The neighborhood N has one of the two configurations pictured below. The configuration to
the left is used for 4-connectivity and the configuration to the right is used for 8-connectivity.
When the loop of the pseudocode below terminates, b will be the image that contains connected component
information. That is, points of X that have the same pixel value in the image b lie in the same connected
component of X.
Initially, each feature pixel of a has a corresponding unique label in the image c. The algorithm works by
propagating the maximum label within a connected component to every point in the connected component.
The components of a are labeled using the image algebra pseudocode that follows. When the loop terminates
the image with the labeled components will be contained in image variable b.
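An illustrative Python/NumPy sketch of the propagation idea follows, using 4-connectivity and maximum-label propagation; it mirrors the loop described above but is not the book's image algebra pseudocode, and the function name is an assumption.

    import numpy as np

    def label_components(a):
        # a: binary image; every feature pixel starts with a unique positive label
        m, n = a.shape
        c = np.where(a > 0, np.arange(1, m * n + 1).reshape(m, n), 0)
        while True:
            d = c.copy()
            d[:-1, :] = np.maximum(d[:-1, :], c[1:, :])   # propagate from below
            d[1:, :] = np.maximum(d[1:, :], c[:-1, :])    # from above
            d[:, :-1] = np.maximum(d[:, :-1], c[:, 1:])   # from the right
            d[:, 1:] = np.maximum(d[:, 1:], c[:, :-1])    # from the left
            d = np.where(a > 0, d, 0)                     # only feature pixels keep labels
            if np.array_equal(d, c):
                return c
            c = d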
Note that this formulation propagates the minimum value within a component to every point in the
component. The variant template t as defined above is used to label 4-connected components. It is easy to see
how the transition from invariant template to variant has been made. The same transition can be applied to the
invariant template used for labeling 8-components.
Variant templates are not as efficient to implement as invariant templates. However, in this implementation of
component labeling there is one less image multiplication for each iteration of the while loop.
Although the above algorithms are simple, their computational cost is high. The number of iterations is
directly proportional to mn. A faster alternate labeling algorithm [2] proceeds in two phases. The number of
iterations is only directly proportional to m + n. However, the price for decreasing computational complexity
is an increase in space complexity.
In the first phase, the alternate algorithm applies the shrinking operation developed in Section 6.4 to the
source image m + n times. The first phase results in m + n binary images a0, a1, … , am+n-1. Each ak represents
an intermediate image in which connected components are shrinking toward a unique isolated pixel. This
isolated pixel is the top left corner of the component’s bounding rectangle, thus it may not be in the connected
component. The image am+n-1 consists of isolated black pixels.
In the second phase, the faster algorithm assigns labels to each of the isolated black pixels of the image am+n-1.
These labels are then propagated to the pixels connected to them in am+n-2 by applying a label propagating
operation, then to am+n-3, and so on until a0 is labeled. In the process of propagating labels, new isolated black
pixels may be encountered in the intermediate images. When this occurs, the isolated pixels are assigned
unique labels and label propagation continues.
Note that the label propagating operations are applied to the images in the reverse order they are generated by
the shrinking operations. More than one connected component may be shrunk to the same isolated pixel, but
at different iterations. A unique label assigned to an isolated pixel should include the iteration number at
which it is generated.
The template s used for image shrinking and the neighborhood function P used for label propagation are
defined in following figure. The faster algorithm for labeling
8-components is as follows:
When the last loop terminates the labeled image will be contained in the variable c. If 4-component labeling is
preferred, the characteristic function in line 4 should be replaced with
Figure 6.3.1 A binary image, its 4-labeled image (center) and 8-labeled image (right).
One of four windows may be chosen as the shrinking pattern. Each is distinguished by the direction in which it
compresses the image.
Shrinking toward top right —
Each iteration of the algorithm consists of applying, in parallel, the selected shrinking window on every
element of the image. Iteration continues until the original image has been reduced to the zero image. After
each iteration the number of isolated points is counted and added to a running total of isolated points. Each
isolated point corresponds to an 8-connected component of the original image.
This shrinking process is illustrated in the series (in left to right and top to bottom order) of images in Figure
6.4.1. The original image (in the top left corner) consists of five 8-connected patterns. By the fifth iteration the
component in the upper left corner has been reduced to an isolated pixel. Then the isolated pixel is counted
and removed in the following iteration. This process continues until each component has been reduced to an
isolated pixel, counted, and then removed.
The image variable a is initialized with the original image. The integer variable p is initially set to zero. The
algorithm consists of executing the following loop until the image variable a is reduced to the zero image.
When the algorithm terminates, the value of p will represent the number of 8-connected components of the
original image.
The results of applying this pruning algorithm are shown in Figure 6.5.1.
Figure 6.5.1 The source image a is shown at the left and the pruned version of a on the right.
When the loop terminates the “filled” image will be contained in the image variable c. The second method
fills the whole domain of the source image with 1’s. Then, starting at the edges of the domain of the image,
extraneous 1’s are peeled away until the exterior boundary of the objects in the source image are reached.
Depending on how is implemented, it may be necessary to add the line
immediately after the statement that initializes c to 1. This is because template t needs to encounter a 0 before
the peeling process can begin.
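A hedged Python/NumPy sketch of this second method: the background is grown inward from the image border through non-object pixels, and the complement of that background is the filled image. The seeding from the border and the 4-connected growth below are assumptions of the sketch.

    import numpy as np

    def fill_holes(a):
        m, n = a.shape
        free = (a == 0)                       # pixels not belonging to an object
        background = np.zeros((m, n), dtype=bool)
        background[0, :] = free[0, :]         # seed with non-object border pixels
        background[-1, :] |= free[-1, :]
        background[:, 0] |= free[:, 0]
        background[:, -1] |= free[:, -1]
        while True:
            grown = background.copy()
            grown[:-1, :] |= background[1:, :]
            grown[1:, :] |= background[:-1, :]
            grown[:, :-1] |= background[:, 1:]
            grown[:, 1:] |= background[:, :-1]
            grown &= free                     # the background cannot grow into objects
            if np.array_equal(grown, background):
                break
            background = grown
        return (~background).astype(np.uint8) # objects with their interior holes filled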
6.7. References
1 D. Ballard and C. Brown, Computer Vision. Englewood Cliffs, NJ: Prentice-Hall, 1982.
2 R. Cypher, J. Sanz, and L. Snyder, “Algorithms for image component labeling on SIMD mesh
connected computers,” IEEE Transactions on Computers, vol. 39, no. 2, pp. 276-281, 1990.
3 H. Alnuweiri and V. Prasanna, “Fast image labeling using local operators on mesh-connected
computers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-13, no. 2,
pp. 202-207, 1991.
4 H. Alnuweiri and V. Prasanna, “Parallel architectures and algorithms for image component labeling,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-14, no. 10, pp.
1014-1034, 1992.
5 H. Shi and G. Ritter, “Image component labeling using local operators,” in Image Algebra and
Morphological Image Processing IV, vol. 2030 of Proceedings of SPIE, (San Diego, CA), pp. 303-314,
July 1993.
6 S. Levialdi, “On shrinking binary picture patterns,” Communications of the ACM, vol. 15, no. 1, pp.
7-10, 1972.
7 N. Hamadani, “Automatic target cueing in IR imagery,” Master’s thesis, Air Force Institute of
Technology, WPAFB, Ohio, Dec. 1981.
8 G. Ritter, “Boundary completion and hole filling,” Tech. Rep. TR 87-05, CCVR, University of
Florida, Gainesville, 1987.
Chapter 7
Morphological Transforms and Techniques
7.1. Introduction
Mathematical morphology is that part of digital image processing that is concerned with image filtering and
geometric analysis by structuring elements. It grew out of the early work of Minkowski and Hadwiger on
geometric measure theory and integral geometry [1, 2, 3], and entered the modern era through the work of
Matheron and Serra of the Ecole des Mines in Fontainebleau, France [4, 5]. Matheron and Serra not only
formulated the modern concepts of morphological image transformations, but also designed and built the
Texture Analyzer System, a parallel image processing architecture based on morphological operations [6]. In
the U.S., research into mathematical morphology began with Sternberg at ERIM. Serra and Sternberg were
the first to unify morphological concepts and methods into a coherent algebraic theory specifically designed
for image processing and image analysis. Sternberg was also the first to use the term image algebra [7, 8].
Initially the main use of mathematical morphology was to describe Boolean image processing in the plane,
but Sternberg and Serra extended the concepts to include gray valued images using the cumbersome notion of
an umbra. During this time a divergence of definition of the basic operations of dilation and erosion occurred,
with Sternberg adhering to Minkowski’s original definition. Sternberg’s definitions have been used more
regularly in the literature, and, in fact, are used by Serra in his book on mathematical morphology[5].
Since those early days, morphological operations and techniques have been applied from low-level, to
intermediate, to high-level vision problems. Among some recent research papers on morphological image
processing are Crimmins and Brown [9], Haralick, et al. [10, 11], and Maragos and Schafer [12, 13, 14]. The
rigorous mathematical foundation of morphology in terms of lattice algebra was independently established by
Davidson and Heijmans [15, 16, 17]. Davidson’s work differs from that of Heijmans’ in that the foundation
provided by Davidson is more general and extends classical morphology by allowing for shift variant
structuring elements. Furthermore, Davidson’s work establishes a connection between morphology, minimax
algebra, and a subalgebra of image algebra.
7.2. Basic Morphological Operations: Boolean Dilations and Erosions
Dilation and erosion are the two fundamental operations that define the algebra of mathematical morphology.
These two operations can be applied in different combinations in order to obtain more sophisticated
operations. Noise removal in binary images provides one simple application example of the operations of
dilation and erosions (Section 7.4).
The language of Boolean morphology is that of set theory. Those points in a set being morphologically
transformed are considered the selected set of points, and those in the complement set are considered to be not
selected. In Boolean (binary) images the set of pixels is the foreground and the set of pixels not selected is the
background. The selected set of pixels is viewed as a set in Euclidean 2-space. For example, the set of all
black pixels in a Boolean image constitutes a complete description of the image and is viewed as a set in
Euclidean 2-space.
A dilation is a morphological transformation that combines two sets by using vector addition of set elements.
In particular, a dilation of the set of black pixels in a binary image by another set (usually containing the
origin), say B, is the set of all points obtained by adding the points of B to the points in the underlying point
set of the black pixels. An erosion can be obtained by dilating the complement of the black pixels and then
taking the complement of the resulting point set.
Dilations and erosions are based on the two classical operations of Minkowski addition and Minkowski
subtraction of integral geometry. For any two sets and , Minkowski addition is defined
as
where B* = {-b : b ∈ B} and A′ = {x : x ∉ A}; i.e., B* denotes the reflection of B across the origin 0 =
(0, 0, …, 0), while A′ denotes the complement of A. Here we have used the original notation employed
in Hadwiger’s book [3].
Defining Ab = {a + b : a ∈ A}, one also obtains the relations
and
where Bp = {b + p : b ∈ B}. It is these last two equations that make morphology so appealing to many
researchers. Equation 7.2.1 is the basis of morphological dilations, while Equation 7.2.2 provides the basis for
morphological erosions. Suppose B contains the origin 0. Then Equation 7.2.1 says that A × B is the set of all
points p such that the translate of B by the vector p intersects A. Figures 7.2.1 and 7.2.2 illustrate this
situation.
Similarly, the erosion of the image a ∈ {0, 1}^X by B is the binary image b ∈ {0, 1}^X defined by
The dilation and erosion of images as defined by Equations 7.2.3 and 7.2.4 can be realized by using max and
min operations or, equivalently, by using OR and AND operations. In particular, it follows from Equations
7.2.3 and 7.2.1 that the dilated image b obtained from a is given by
Similarly, the eroded image b obtained from a is given by
The image algebra formulation of the dilation of the image a by the structuring element B is given by
The image algebra equivalent of the erosion of a by the structuring element B is given by
Replacing a by its Boolean complement ac in the above formulation for dilation (erosion) will dilate (erode)
the complement of the support of a.
The image algebra formulation of the dilation of the image a by the structuring element B is given by
The image algebra equivalent of the erosion of a by the structuring element B is given by
With a minor change in template definition, binary dilations and erosions can just as well be accomplished
using the lattice convolution operations and , respectively.
These equations constitute the basic laws that govern the algebra of mathematical morphology and provide
the fundamental tools for geometric shape analysis of images.
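For illustration, the max/min realization noted above can be sketched in Python/NumPy with the structuring element B given as a list of (row, column) offsets; the shift helper and the border convention (values outside the image treated as 0) are assumptions of the sketch, not the book's iac++ formulation.

    import numpy as np

    def shift(a, dy, dx, fill=0):
        # translate a by (dy, dx), filling uncovered positions with `fill`
        m, n = a.shape
        out = np.full_like(a, fill)
        out[max(0, dy):m + min(0, dy), max(0, dx):n + min(0, dx)] = \
            a[max(0, -dy):m - max(0, dy), max(0, -dx):n - max(0, dx)]
        return out

    def dilate(a, B):
        # b(p) = max over (dy, dx) in B of a(p - (dy, dx))
        return np.maximum.reduce([shift(a, dy, dx) for (dy, dx) in B])

    def erode(a, B):
        # b(p) = min over (dy, dx) in B of a(p + (dy, dx)); outside the image counts as 0
        return np.minimum.reduce([shift(a, -dy, -dx) for (dy, dx) in B])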
The current notation used in morphology for a dilation of A by B is A ⊕ B, while an erosion of A by B is
denoted by A ⊖ B. In order to avoid confusion with the linear image-template product ⊕, we use Hadwiger’s
notation [3] in order to describe morphological image transforms. In addition, we use the now more
commonly employed definitions of Sternberg for dilation and erosion. A comparison of the different notation
and definitions used to describe erosions and dilations is provided by Table 7.2.1.
Table 7.2.1 Notation and Definitions Used to Describe Dilations and Erosions
7.3. Opening and Closing
Dilations and erosions are usually employed in pairs; a dilation of an image is usually followed by an erosion
of the dilated result or vice versa. In either case, the result of successively applied dilations and erosions
results in the elimination of specific image detail smaller than the structuring element without the global
geometric distortion of unsuppressed features.
An opening of an image is obtained by first eroding the image with a structuring element and then dilating the
result using the same structuring element. The closing of an image is obtained by first dilating the image with
a structuring element and then eroding the result using the same structuring element. The next section shows
that opening and closing provide a particularly simple mechanism for shape filtering.
The operations of opening and closing are idempotent; their reapplication effects no further changes to the
previously transformed results. In this sense openings and closings are to morphology what orthogonal
projections are to linear algebra. An orthogonal projection operator is idempotent and selects the part of a
vector that lies in a given subspace. Similarly, opening and closing provide the means by which given
subshapes or supershapes of a complex geometric shape can be selected.
The opening of A by B is denoted by A B and defined as
The image algebra formulation of the opening of the image a by the structuring element B is given by
The image algebra equivalent of the closing of a by the structuring element B is given by
Consider the image shown in Figure 7.4.1. Let A denote the set of all black pixels. Choosing the structuring
element B shown in Figure 7.4.2, the opening of A by B
removes all the pepper noise (small black areas) from the input images. Doing a closing on A B,
closes the small white holes (salt noise) and results in the image shown in Figure 7.4.3.
The image b derived from a using the morphological salt and pepper noise removal technique is given by
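Building on the dilate and erode sketch given in Section 7.2 above, opening, closing, and the salt-and-pepper cleanup just described can be sketched as follows (illustrative only; the function names are assumptions):

    def opening(a, B):
        return dilate(erode(a, B), B)

    def closing(a, B):
        return erode(dilate(a, B), B)

    def remove_salt_and_pepper(a, B):
        # the opening removes small black specks (pepper); the subsequent closing
        # fills small white holes (salt), using the same structuring element B
        return closing(opening(a, B), B)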
Since a dilation can be obtained from an erosion via the duality A × B = (A′/B*)′, it follows that a dilation is
also a special case of the HMT.
The image algebra equivalent of the hit-and-miss transform applied to the image a using the structuring
element B is given by
Alternate Image Algebra Formulation
Let the neighborhoods N, M : X → 2^X be defined by
and
Then
Figure 7.5.1 Hit-and-miss transform used to locate square regions. Region A (corresponding to source image
a) is transformed to region B (image b) with the morphological hit-and-miss transform using structuring
element B = (D, E) by A = A B. In image algebra notation, c = a t with t as shown in Figure 7.5.2.
Figure 7.5.2 Template used for the image algebra formulation of the hit-and-miss transform designed to locate
square regions.
Note that the notion of an unbounded set is exhibited in this definition; the value of xn+1 can approach -∞.
Since , we can dilate by any other subset of . This observation provides the
clue for dilation of gray-valued images. In general, the dilation of a function by a function g : X →
, where X ⊂ , is defined through the dilation of their umbras (f) × (g) as follows. Let
and define a function .
We now define f × g ≡ d. The erosion of f by g is defined as the function f/g ≡ e, where and
.
Openings and closings of f by g are defined in the same manner as in the binary case. Specifically, the
opening of f by g is defined as f g = (f/g) × g while the closing of f by g is defined as f • g = (f × g)/g.
When calculating the new functions d = f × g and e = f/g, the following formulas for two-dimensional
dilations and erosions are actually being used:
Note that the condition y - x ∈ B is equivalent to x ∈ By, where By denotes the translation of B by the vector y.
The image algebra equivalents of a gray scale dilation and a gray scale erosion of a by the structuring element
g are now given by
and
respectively. Here
That the algebra is more general than mathematical morphology should come as no surprise as templates
are more general objects than structuring elements. Since structuring elements correspond to translation
invariant templates, morphology lacks the ability to implement translation variant lattice transforms
effectively. In order to implement such transforms effectively, morphology needs to be extended to include
the notion of translation variant structuring elements. Of course, this extension is already a part of image
algebra.
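The two-dimensional gray scale dilation and erosion formulas above can be sketched directly in Python/NumPy. Here the structuring function g is represented as a dictionary mapping (row, column) offsets to values; that representation and the border convention (out-of-range samples ignored) are assumptions of the sketch.

    import numpy as np

    def gray_dilate(a, g):
        # (a dilated by g)(y) = max over offsets b of [ a(y - b) + g(b) ]
        m, n = a.shape
        out = np.full((m, n), -np.inf)
        for (dy, dx), gv in g.items():
            shifted = np.full((m, n), -np.inf)
            shifted[max(0, dy):m + min(0, dy), max(0, dx):n + min(0, dx)] = \
                a[max(0, -dy):m - max(0, dy), max(0, -dx):n - max(0, dx)] + gv
            out = np.maximum(out, shifted)
        return out

    def gray_erode(a, g):
        # (a eroded by g)(y) = min over offsets b of [ a(y + b) - g(b) ],
        # obtained here by duality from the dilation of -a by the reflected g
        reflected = {(-dy, -dx): gv for (dy, dx), gv in g.items()}
        return -gray_dilate(-np.asarray(a, float), reflected)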
It may be obvious that the generation of the surface b is equivalent to an erosion. To realize this equivalence,
let s : X → be defined by , where . Obviously, the graph of s
corresponds to the upper surface (upper hemisphere) of a ball s of radius r with center at the origin of .
Using Equation 7.6.1, it is easy to show that b = a/s.
Next, let s roll in the surface b such that the center of s is always a point of b and such that every point of b
gets hit by the center of s. Then the top point of s traces another surface c above b. If a is flat or smooth, with
local curvature never less than r, then c = a. However, if a contains crevasses into which s does not fit — that
is, locations at which s is not tangent to a or points of a with curvature less than r, etc. — then c ≠ a. Figure
7.7.3 illustrates this situation. It should also be clear by now that c corresponds to a dilation of b by s.
Therefore, c is the opening of a by s, namely
Hence, c ≤ a.
Let denote the digital source image and s a structuring element represented by a digital hemisphere of
some desired radius r about the origin. Note that the support of s is a digital disk S of radius r. The rolling ball
algorithm is given by the transformation a ’ d which is defined by
The image algebra formulation of the rolling ball algorithm is now given by
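Combining the gray scale erosion and dilation sketched in Section 7.6, the rolling ball transform becomes an opening by a hemispherical structuring function. The Python sketch below builds that hemisphere over a digital disk of radius r; the names are illustrative, and it reuses gray_dilate and gray_erode from the earlier sketch.

    import numpy as np

    def hemisphere(r):
        # structuring function: upper hemisphere of radius r over a digital disk
        s = {}
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dy * dy + dx * dx <= r * r:
                    s[(dy, dx)] = float(np.sqrt(r * r - dy * dy - dx * dx))
        return s

    def rolling_ball(a, r):
        s = hemisphere(r)
        return gray_dilate(gray_erode(np.asarray(a, float), s), s)   # the opening of a by s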
7.8. References
1 H. Minkowski, “Volumen und oberflache,” Mathematische Annalen, vol. 57, pp. 447-495, 1903.
2 H. Minkowski, Gesammelte Abhandlungen. Leipzig-Berlin: Teubner Verlag, 1911.
3 H. Hadwiger, Vorlesungen Über Inhalt, OberflSche und Isoperimetrie. Berlin: Springer-Verlag,
1957.
4 G. Matheron, Random Sets and Integral Geometry. New York: Wiley, 1975.
5 J. Serra, Image Analysis and Mathematical Morphology. London: Academic Press, 1982.
6 J. Klein and J. Serra, “The texture analyzer,” Journal of Microscopy, vol. 95, 1972.
7 S. Sternberg, “Biomedical image processing,” Computer, vol. 16, Jan. 1983.
8 S. Sternberg, “Overview of image algebra and related issues,” in Integrated Technology for Parallel
Image Processing (S. Levialdi, ed.), London: Academic Press, 1985.
9 T. Crimmins and W. Brown, “Image algebra and automatic shape recognition,” IEEE Transactions
on Aerospace and Electronic Systems, vol. AES-21, pp. 60-69, Jan. 1985.
10 R. Haralick, L. Shapiro, and J. Lee, “Morphological edge detection,” IEEE Journal of Robotics and
Automation, vol. RA-3, pp. 142-157, Apr. 1987.
11 R. Haralick, S. Sternberg, and X. Zhuang, “Image analysis using mathematical morphology: Part I,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 532-550, July 1987.
12 P. Maragos and R. Schafer, “Morphological filters Part I: Their set-theoretic analysis and relations
to linear shift-invariant filters,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.
ASSP-35, pp. 1153-1169, Aug. 1987.
13 P. Maragos and R. Schafer, “Morphological filters Part II : Their relations to median, order-statistic,
and stack filters,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-35, pp.
1170-1184, Aug. 1987.
14 P. Maragos, A Unified Theory of Translation-Invariant Systems with Applications to Morphological
Analysis and Coding of Images. Ph.D. dissertation, Georgia Institute of Technology, Atlanta, 1985.
15 J. Davidson, Lattice Structures in the Image Algebra and Applications to Image Processing. PhD
thesis, University of Florida, Gainesville, FL, 1989.
16 J. Davidson, “Foundation and applications of lattice transforms in image processing,” in Advances
in Electronics and Electron Physics (P. Hawkes, ed.), vol. 84, pp. 61-130, New York, NY: Academic
Press, 1992.
17 H. Heijmans, “Theoretical aspects of gray-level morphology,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 13(6), pp. 568-582, 1991.
Chapter 8
Linear Image Transforms
8.1. Introduction
A large class of image processing transformations is linear in nature; an output image is formed from linear
combinations of pixels of an input image. Such transforms include convolutions, correlations, and unitary
transforms. Applications of linear transforms in image processing are numerous. Linear transforms have been
utilized to enhance images and to extract various features from images. For example, the Fourier transform is
used in highpass and lowpass filtering (Chapter 2) as well as in texture analysis. Another application is image
coding in which bandwidth reduction is achieved by deleting low-magnitude transform coefficients. In this
chapter we provide some typical examples of linear transforms and their reformulations in the language of
image algebra.
The inner integral (enclosed by square brackets) is the Fourier transform of f. It is a function of u alone.
Replacing the integral definition of the Fourier transform of f by we get
Euler's formula allows e^{2πiux} to be expressed as cos(2πux) + i sin(2πux). Thus, e^{2πiux} is a sum of a real and a
complex sinusoidal function of frequency u. The integral is the continuous analog of summing over all
frequencies u. The Fourier transform evaluated at u can be viewed as a weight applied to the real and
complex sinusoidal functions at frequency u in a continuous summation.
Combining all these observations, the original function f(x) is seen as a continuous weighted sum of
sinusoidal functions. The weight applied to the real and complex sinusoidal functions of frequency u is given
by the Fourier transform evaluated at u. This explains the use of the term “frequency” as an adjective for the
domain of the Fourier transform of a function.
For image processing, an image can be mapped into its frequency domain representation via the Fourier
transform. In the frequency domain representation the weights assigned to the sinusoidal components of the
image become accessible for manipulation. After the image has been processed in the frequency domain
representation, the representation of the enhanced image in the spatial domain can be recovered using the
inverse Fourier transform.
The discrete equivalent of the one-dimensional continuous Fourier transform pair is given by
and
where and . In digital signal processing, a is usually viewed as having been obtained
from a continuous function by sampling f at some finite number of uniformly spaced points
and setting a(k) = f(x_k).
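For illustration only, the discrete pair can be evaluated directly from the defining sums. The following Python/NumPy sketch assumes the common convention that places the 1/n factor on the forward transform; the book's normalization may differ.

import numpy as np

def dft(a):
    # a_hat(u) = (1/n) * sum_k a(k) * exp(-2*pi*i*u*k/n)
    n = len(a)
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return (w @ a) / n

def idft(a_hat):
    # a(k) = sum_u a_hat(u) * exp(2*pi*i*u*k/n)
    n = len(a_hat)
    k = np.arange(n)
    w = np.exp(2j * np.pi * np.outer(k, k) / n)
    return w @ a_hat

a = np.random.rand(64)
assert np.allclose(idft(dft(a)), a)      # the pair is mutually inverse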
Figure 8.2.1 shows the image of a jet and its Fourier transform image. The Fourier transform image is
complex valued and, therefore, difficult to display. The value at each point in Figure 8.2.1 is actually the
magnitude of its corresponding complex pixel value in the Fourier transform image. Figure 8.2.1 does show a
full period of the transform; however, the origin of the transform does not appear at the center of the display. A
representation of one full period of the transform image with its origin shifted to the center of the display can
be achieved by multiplying each point a(x, y) by (-1)^{x+y} before applying the transform. The jet's Fourier
transform image shown with the origin at the center of the display is seen in Figure 8.2.2.
Figure 8.2.1 The image of a jet (left) and its Fourier transform image (right).
The template f is called the one-dimensional Fourier template. It follows directly from the definition of f that
f′ = f and (f*)′ = f* where f* denotes the complex conjugate of f defined by
. Hence, the equivalent of the discrete inverse Fourier transform of â is
given by
The image algebra equivalent formulation of the two-dimensional discrete Fourier transform pair is given by
and
and
Periodicity and symmetry are the key ingredients for understanding the need for centering the Fourier
transform for interpretation purposes. The periodicity property indicates that has period of length m in
the u direction and of length n in the v direction, while the symmetry shows that the magnitude is centered
about the origin. This is shown in Figure 8.2.1 where the origin (0,0) is located at the upper left hand corner of
the image. Since the discrete Fourier transform has been formulated for values of u in the interval [0, m -1]
and values v in the interval [0, n - 1], the result of this formulation yields two half periods in these intervals
that are back to back. This is illustrated in Figure 8.3.1 (a), which shows a one-dimensional slice along the
u-axis of the magnitude function . Therefore, in order to display one full period in both the u and v
directions, it is necessary to shift the origin to the midpoint and add points that shift to the outside of
the image domain back into the image domain using modular arithmetic. Figures 8.2.2 and 8.3.1 illustrate the
effect of this centering method. The reason for using modular arithmetic is that, in contrast to Figure 8.3.1 (a),
when Fourier transforming where , there is no information available outside the
intervals [0, m - 1] and [0, n - 1]. Thus, in order to display the full periodicity centered at the midpoint in the
interval [0, m - 1], the intervals and need to be flipped and their ends glued together. The
same gluing procedure needs to be performed in the v direction. This amounts to doing modular arithmetic.
Figure 8.3.2 illustrates this process.
Figure 8.3.1 Periodicity of the Fourier transform. Figure (a) shows the two half periods, one on the interval
and the other on the interval . Figure (b) shows the full period of the shifted Fourier transform on
the interval [0, m].
Note that the centering function is its own inverse since center (center(p)) = p. Therefore, if denotes the
centered version of â, then given by
Figure 8.3.2 illustrates the mapping of points when applying center to an array X.
Figure 8.3.2 The centering function applied to a 16 × 16 array. The different shadings indicate the mapping
of points under the center function.
Note that mid(X) does not correspond to the midpoint of X, but gives the midpoint of the set X - min(X). The
actual midpoint of X is given by center(min(X)). This is illustrated by the example shown in Figure 8.3.3. In
this example, min(X) = (4, 5), max(X) = (13, 18), and mid(X) = (5, 7). Also, if , then min(X) =
(0, 0), max(X) = (m - 1, n - 1), and . This shows that the general center function reduces to
the previous center function.
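A minimal sketch of this centering map, assuming the rectangular array X = {0, ..., m-1} × {0, ..., n-1} with even m and n (assumptions made only for this illustration):

import numpy as np

def center(p, shape):
    # Shift p by the array midpoint and wrap back into X with modular arithmetic.
    m, n = shape
    x, y = p
    return ((x + m // 2) % m, (y + n // 2) % n)

m, n = 16, 16
for x in range(m):
    for y in range(n):
        # the centering function is its own inverse on even-sized arrays
        assert center(center((x, y), (m, n)), (m, n)) == (x, y)

Applying center to the index set of a Fourier transform array moves the origin (0, 0) to (m // 2, n // 2), which is also what multiplying a(x, y) by (-1)^{x+y} before transforming achieves (compare np.fft.fftshift).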
Figure 8.3.3 The shift X - min(X) of a 10 × 14 array X. The point mid(X) = (5, 7) is shown in black.
If X is a square array of form then centering can be accomplished by multiplying the image a
by the image b defined by b(x, y) = (-1)^{x+y} prior to taking the Fourier transform. This follows from the simple
fact expressed by the equation
which holds whenever . Thus, in this case we can compute by using the formulation
In this form it is easy to see that O(n²) complex adds and multiplications are required to compute the Fourier
transform using the formulation of Section 8.2. For each 0 ≤ j ≤ n - 1, n complex multiplications of a(k) by
are required for a total of n² complex multiplications. For each 0 ≤ j ≤ n - 1 the sum
requires n - 1 complex adds for a total of n² - n complex adds. The complex arithmetic involved
in the computation of the terms does not enter into the measure of computational complexity that is
optimized by the fast Fourier algorithm.
The number of complex adds and multiplications can be reduced to O(n log₂ n) by incorporating the
Cooley-Tukey radix-2 fast Fourier algorithm into the image algebra formulation of the Fourier transform. It is
assumed for the FFT optimization that n is a power of 2.
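The O(n log₂ n) count can be seen in a conventional recursive radix-2 Cooley-Tukey sketch. This plain NumPy illustration follows the unnormalized convention of np.fft.fft and is not the image algebra formulation of the text, which obtains the same operation count with the templates t(p) and the permutation ρ_n.

import numpy as np

def fft_radix2(a):
    # Recursive radix-2 decimation in time; len(a) must be a power of 2.
    n = len(a)
    if n == 1:
        return np.asarray(a, dtype=complex)
    even = fft_radix2(a[0::2])
    odd = fft_radix2(a[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

a = np.random.rand(256)
assert np.allclose(fft_radix2(a), np.fft.fft(a))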
The separability of the Fourier transform can be exploited to obtain a similar computational savings on higher
dimensional images. Separability allows the Fourier transform to be computed over an image as a succession
of one-dimensional Fourier transforms along each dimension of the image. Separability will be discussed in
more detail later.
For the mathematical foundation of the Cooley-Tukey fast Fourier algorithm the reader is referred to Cooley
et al. [1, 2, 3]. A detailed exposition on the integration of the Cooley-Tukey algorithm into the image algebra
formulation of the FFT can be found in Ritter [4]. The separability of the Fourier transform and the definition
of the permutation function used in the image algebra formulation of the FFT will be discussed
here.
The separability of the Fourier transform is key to the decomposition of the two-dimensional FFT into two
successive one-dimensional FFTs. The structure of the image algebra formulation of the two-dimensional
FFT will reflect the utilization of separability.
The double summation can be rewritten as
or
where
For each 0 ≤ k ≤ n - 1, â(u, k) is the one-dimensional Fourier transform of the kth column of a. For each 0 ≤ u
≤ m - 1,
For n = 2^k, the permutation is the function that reverses the bit order of the k-bit binary
representation of its input, and outputs the decimal representation of the value of the reversed bit order. For
example, the table below represents . The permutation function will be used in the image
algebra formulation of the FFT. The algorithm used to compute the permutation function ρ_n is presented next.
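(The algorithm itself is given in image algebra form in the text. Purely as an illustration of what ρ_n computes, a direct Python sketch of the bit reversal follows; the name rho_n is only a stand-in.)

def rho_n(j, k):
    # Reverse the k-bit binary representation of j and return its value.
    rev = 0
    for _ in range(k):
        rev = (rev << 1) | (j & 1)
        j >>= 1
    return rev

# For n = 8 (k = 3), the indices 0, 1, ..., 7 map to 0, 4, 2, 6, 1, 5, 3, 7.
print([rho_n(j, 3) for j in range(8)])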
Notice that from the definition of the template t there are at most 2 values of l in the support of t for every
0 ≤ j ≤ n - 1. Thus, only 2n complex multiplications and n complex adds are required to evaluate a •t(2^{i-1}).
Since the convolution a •t(2^{i-1}) is contained within a loop that consists of log₂ n iterations, there are O(n log n)
complex adds and multiplications in the image algebra formulation of the FFT.
Two-Dimensional FFT
In our earlier discussion of the separability of the Fourier transform, we noted that the two-dimensional DFT
can be computed in two steps by successive applications of the one-dimensional DFT; first along each row
followed by a one-dimensional DFT along each column. Thus, to obtain a fast Fourier transform for
two-dimensional images, we need to apply the image algebra formulation of the one-dimensional FFT in
simple succession. However, in order to perform the operations and a •t(p) specified by the
algorithm, it becomes necessary to extend the function Án and the template t(p) to two-dimensional arrays.
For this purpose, suppose that , where m = 2^h and n = 2^k, and assume without loss of
generality that n ≤ m.
Let P = {2^i : i = 0, 1, ..., log₂ m - 1} and for p ∈ P define the parameterized row template
where . Note that for each p ∈ P, t(p) is a row template which is essentially identical
to the template used in the one-dimensional case.
The permutation ρ is extended to a function r : X → X in a similar fashion by restricting its actions to the rows
of X. In particular, define
With the definitions of r and t completed, we are now in a position to specify the two-dimensional radix-2
FFT in terms of image algebra notation.
If X, rm and t are specified as above and , then the following algorithm computes the two-dimensional
fast Fourier transform of a.
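Separability can be checked numerically. The following NumPy sketch (an illustration, not the image algebra algorithm itself) computes the two-dimensional transform as one-dimensional FFTs along the rows followed by one-dimensional FFTs along the columns.

import numpy as np

def fft2_separable(a):
    rows = np.fft.fft(a, axis=1)      # 1-D FFT of each row
    return np.fft.fft(rows, axis=0)   # 1-D FFT of each column of the result

a = np.random.rand(32, 64)
assert np.allclose(fft2_separable(a), np.fft.fft2(a))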
The alternate formulation for the fast Fourier transform using spatial transforms is given by
Note that the use of the transpose a′ before and after the second loop in the code above is for notational
simplicity, not computational efficiency. The use of transposes can be eliminated by using analogues of the
functions f_i, g_i, and w_i in the second loop that are functions of the column coordinate of X.
where
The cosine transform provides the means of expressing an image as a weighted sum of cosine functions. The
weights in the sum are given by the cosine transform. Figure 8.5.1 illustrates this by showing how the square
wave
is approximated using the first five terms of the discrete cosine function.
For the two-dimensional cosine transform and its inverse are given by
and
respectively.
Figure 8.5.2 shows the image of a jet and the image which represents the pixel magnitude of its cosine
transform image.
Figure 8.5.1 Approximation of a square wave using the first five terms of the discrete cosine transform.
The image algebra formulations of the two-dimensional transform and its inverse are
and
respectively.
Figure 8.5.2 Jet image (left) and its cosine transform image.
The one-dimensional transforms are formulated similarly. The template used for the one-dimensional case is
The image algebra formulations of the one-dimensional discrete cosine transform and its inverse are
and
respectively.
In [5], a method for computing the one-dimensional cosine transform using the fast Fourier transform is
introduced. The method requires only a simple rearrangement of the input data into the FFT. The even and
odd terms of the original image , n = 2^k, are rearranged to form a new image b according to the
formula
Using the new arrangement of terms the cosine transform can be written as
Substituting x = n - 1 - x′ into the second sum and using the sum of two angles formula for cosine functions
yields
Therefore, it is only necessary to compute terms of the inverse Fourier transform in order to calculate
for u = 0,1,...,n - 1.
The inverse cosine transform can be obtained from the real part of the inverse Fourier transform of
With the information above it is easy to adapt the formulations of the one-dimensional fast Fourier transforms
(Section 8.4) for the implementation of fast one-dimensional cosine transforms. The cosine transform and its
inverse are separable. Therefore, fast two-dimensional transform implementations are possible by taking
one-dimensional transforms along the rows of the image, followed by one-dimensional transforms along the
columns [5, 6, 7].
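The rearrangement of [5] can be sketched as follows. This NumPy illustration uses the unnormalized DCT-II convention and the forward FFT with a negative exponent; the book's normalization and sign conventions may differ, but the even/odd reordering and the single complex phase factor per output term are the essential points.

import numpy as np

def dct_via_fft(a):
    # b holds the even-indexed samples in order followed by the
    # odd-indexed samples in reverse order; then
    # C(u) = Re( exp(-i*pi*u/(2n)) * FFT(b)(u) ).
    n = len(a)
    b = np.concatenate([a[0::2], a[1::2][::-1]])
    phase = np.exp(-1j * np.pi * np.arange(n) / (2 * n))
    return np.real(phase * np.fft.fft(b))

def dct_direct(a):
    # Reference: C(u) = sum_x a(x) * cos(pi * (2x + 1) * u / (2n))
    n = len(a)
    x = np.arange(n)
    u = np.arange(n)[:, None]
    return (np.cos(np.pi * (2 * x + 1) * u / (2 * n)) * a).sum(axis=1)

a = np.random.rand(32)
assert np.allclose(dct_via_fft(a), dct_direct(a))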
The term b_j(z) denotes the jth bit in the binary expansion of z. For example,
form the basis of the Walsh transform and for each pair (x, u), g_u(x) represents the (x, u) entry of the
Hadamard matrix mentioned above. Rewriting the expression for the inverse Walsh transform as
it is seen that a is a weighted sum of the basis functions gu(x). The weights in the sum are given by the Walsh
transform. Figure 8.6.1 shows the Walsh basis functions for n = 4 and how an image is written as a weighted
sum of the basis functions.
Thus, the Walsh transform and the Fourier transform are similar in that they both provide the coefficients for
the representation of an image as a weighted sum of basis functions. The basis functions for the Fourier
transform are sinusoidal functions of varying frequencies. The basis functions for the Walsh transform are the
elements of defined above. The rate of transition from negative to positive value in the Walsh
basis function is analogous to the frequency of the Fourier basis function. The frequencies of the basis
functions for the Fourier transform increase as u increases. However, the rate at which Walsh basis functions
change signs is not an increasing function of u.
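A compact sketch of the transform pair, assuming the Sylvester (natural) ordering of the Hadamard matrix and a 1/n factor on the forward transform (both assumptions for illustration; sequency-ordered Walsh bases and other normalizations are equally common):

import numpy as np

def hadamard(n):
    # Sylvester construction: entries are +1 or -1, n must be a power of 2.
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def walsh_transform(a):
    n = len(a)
    return hadamard(n) @ a / n

def inverse_walsh_transform(w):
    n = len(w)
    return hadamard(n) @ w            # H is symmetric and H @ H = n * I

a = np.random.rand(16)
assert np.allclose(inverse_walsh_transform(walsh_transform(a)), a)

Because the matrix entries are ±1, the transform needs only additions and subtractions apart from the final division by n.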
and
respectively. Here again, a can be represented as a weighted sum of the basis functions
with coefficients given by the Walsh transform. Figure 8.6.2 shows the two-dimensional Walsh basis for n =
4. The function g(u, v)(x, y) is represented by the image guv in which pixel values of 1 are white and pixel values
of -1 are black.
Figure 8.6.3 shows the magnitude image (right) of the two-dimensional Walsh transform of the image of a jet
(left).
Figure 8.6.3 Jet image and the magnitude image of its Walsh transform image.
Image Algebra Formulation
The image algebra formulation of the fast Walsh transform is identical to that of the fast Fourier formulation
(Section 8.4), with the exception that the template t used for the Walsh transform is
The Walsh transform shares the important property of separability with the Fourier transform. Thus, the
two-dimensional Walsh transform can also be computed by taking the one-dimensional Walsh transforms
along each row of the image, followed by another one-dimensional Walsh transform along the columns.
In [11, 12], Zhu provides a fast version of the Walsh transform in terms of the p-product. Zhu’s method also
eliminates the reordering process required in most fast versions of the Walsh transform. Specifically, given a
one-dimensional signal , where m = 2^k, the Walsh transform of a is given by
Note that the 2-product formulation of the Walsh transform involves only the values +1 and -1. Therefore
there is no multiplication involved except for final multiplication by the quantity .
Thus, the function φ_m converts the vector [w_1 •_2 (w_2 •_2 ( ... (w_k •_2 a′)))] back into matrix form.
Additional p-product formulations for signals whose lengths are not powers of two can be found in [11, 12].
The simplest example of a family of functions yielding a multiresolution analysis of the space of all
square-integrable functions on the real line is given by the Haar family of functions. These functions are well
localized in space and are therefore appropriate for spatial domain analysis of signals and images.
Furthermore, the Haar functions constitute an orthogonal basis. Hence, the discrete Haar transform benefits
from the properties of orthogonal transformations. For example, these transformations preserve inner products
when interpreted as linear operators from one space to another. The inverse of an orthogonal transformation is
also particularly easy to implement, since it is simply the transpose of the direct transformation. Andrews [13]
points out that orthogonal transformations are entropy preserving in an information theoretic sense. A
particularly attractive feature of the Haar wavelet transform is its computational complexity, which is linear
with respect to the length of the input signal. Moreover, this transform, as well as other wavelet
representations of images (see Mallat [14]), discriminates oriented edges at different scales.
The appropriate theoretical background for signal and image analysis in the context of a multiresolution
analysis is given by Mallat [14], where multiresolution wavelet representations are obtained by way of
pyramidal algorithms. Daubechies provides a more general treatment of orthonormal wavelet bases and
multiresolution analysis [15], where the Haar multiresolution analysis is presented. The Haar function,
defined below, is the simplest example of an orthogonal wavelet. It may be viewed as the first of the family of
compactly supported wavelets discovered by Daubechies.
The discrete Haar wavelet transform is a separable linear transform that is based on the scaling function
and the dyadic dilations and integer translations of the Haar function
A matrix formulation of the Haar transform using orthogonal matrices follows. A convenient factorization of
these matrices leads to a pyramidal algorithm, which makes use of the generalized matrix product of order
two. The algorithm has complexity O (n).
for the range of integer indices and 0 ≤ q < 2^p. For fixed p and q, the function h_pq(x) is a translation
by q of the function h(2^p x), which is a dilation of the Haar wavelet function h(x). Furthermore, h_pq(x) is
supported on an interval of length 2^{-p}. The infinite family of functions
together with the scaling function g(x) constitute an orthogonal basis, known as the Haar basis, for the space
L²[0,1] of square-integrable functions on the unit interval. This basis can be extended to a basis for . The
Haar basis was first described in 1910 [16].
To obtain a discrete version of the Haar basis, note that any positive integer i can be written uniquely as
and
where i = 2^p + q for i = 1, 2, ..., n - 1. The factor 2^{(p-k)/2} in Equation 8.7.1 normalizes the Euclidean vector
norm of the u_i. Hence, ||u_i||₂ = 1 for all . The vector u_0 is a normalized, discrete version of the scaling
function g(x). Similarly, the vectors u_i for i = 1, 2, ..., n - 1 are normalized, discrete versions of the wavelet
functions h_pq(x). Furthermore, these vectors are mutually orthogonal, i.e., the dot product
Now define the Haar matrices H_n for n = 2^k, k a positive integer, by letting be the ith row vector
of H_n. The orthonormality of the row vectors of H_n implies that H_n is an orthogonal matrix, i.e.,
, where I_n is the n × n identity matrix. Hence, H_n is invertible with inverse
. Setting η = 2^{-1/2}, the normalization factor may be written as
The signal a may be written as a linear combination of the basis vectors u_i:
where the c_i (for all ) are unknown coefficients to be determined. Equation 8.7.2 may be written as
where is the vector of unknown coefficients, i.e., c(i) = c_i for each . Using
and solving for c, one obtains
Equation 8.7.4 defines the Haar transform HaarTransform1D for one-dimensional signals of length n = 2^k,
i.e.,
Furthermore, an image may be reconstructed from the vector c of Haar coefficients using Equation 8.7.3.
Hence, define the inverse Haar transform InverseHaarTransform1D for one-dimensional signals by
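As a numerical illustration (Python/NumPy; the row construction and normalization follow the description above as reconstructed here, so details may differ slightly from the book's exact indexing), the Haar matrix can be built explicitly and the one-dimensional transform pair expressed as multiplication by H_n and its transpose:

import numpy as np

def haar_matrix(n):
    # Rows u_0, ..., u_{n-1} of the orthogonal Haar matrix H_n, n = 2**k.
    k = int(np.log2(n))
    H = np.zeros((n, n))
    H[0, :] = n ** -0.5                      # u_0: normalized scaling vector
    for p in range(k):
        for q in range(2 ** p):
            i = 2 ** p + q
            block = n // 2 ** p              # length of the support of u_i
            start = q * block
            amp = 2.0 ** ((p - k) / 2.0)     # makes ||u_i||_2 = 1
            H[i, start:start + block // 2] = amp
            H[i, start + block // 2:start + block] = -amp
    return H

def haar_transform_1d(a):
    return haar_matrix(len(a)) @ a           # c = H_n a

def inverse_haar_transform_1d(c):
    return haar_matrix(len(c)).T @ c         # a = H_n^T c, since H_n^{-1} = H_n^T

n = 16
H = haar_matrix(n)
assert np.allclose(H @ H.T, np.eye(n))       # rows are orthonormal
a = np.random.rand(n)
assert np.allclose(inverse_haar_transform_1d(haar_transform_1d(a)), a)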
To define the Haar transform for two-dimensional images, let m = 2^k and n = 2^l, with , and let
be a two-dimensional image. The Haar transform of a yields an m × n matrix c = [cij] of
coefficients given by
where aij = a(i, j) for all . Equation 8.7.5 defines the Haar transform
HaarTransform2D of a two-dimensional m × n image:
The image a can be recovered from the m × n matrix c of Haar coefficients by the inverse Haar transform,
InverseHaarTransform2D, for two-dimensional images:
i.e.,
where for all and for all . The outer product of u_i and v_j may be
interpreted as the image of a two-dimensional, discrete Haar wavelet. The sum over all combinations of the
outer products, appropriately weighted by the coefficients cij, reconstructs the original image a.
and
From this factorization, it is clear that only pairwise sums and pairwise differences with multiplication by η
are required to obtain the Haar wavelet transforms (direct and inverse) of an image. Furthermore, the
one-dimensional Haar transform of a signal of length 2^k may be computed in k stages. The number of stages
corresponds to the number of factors for the Haar matrix . Computation of the Haar transform of a signal
of length n requires 4(n - 1) multiplications and 2(n - 1) sums or differences. Hence, the computational
complexity of the Haar transform is O (n). Further remarks concerning the complexity of the Haar transform
can be found in Andrews [13] and Strang [17].
The image algebra formulation of the Haar wavelet transform may be expressed in terms of the normalization
constant η = 2^{-1/2}, the Haar scaling vector g = (η, η)′, and the Haar wavelet vector h = (η, -η)′, as described
below.
Note that the first entry of the resulting vector is a scaled global sum, since
The one-dimensional Haar transform of each row of a is computed first as an intermediate image b. The
second loop of the algorithm computes the Haar transform of each column of b. This procedure is equivalent
to the computation .
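The two-dimensional procedure can be sketched the same way (NumPy; haar_matrix is repeated from the one-dimensional sketch so that this block stands alone, and the identity c = H_m a H_n^T is the reconstruction assumed here):

import numpy as np

def haar_matrix(n):
    # Orthogonal Haar matrix H_n, n = 2**k (see the one-dimensional sketch).
    k = int(np.log2(n))
    H = np.zeros((n, n))
    H[0, :] = n ** -0.5
    for p in range(k):
        for q in range(2 ** p):
            block = n // 2 ** p
            amp = 2.0 ** ((p - k) / 2.0)
            s = q * block
            H[2 ** p + q, s:s + block // 2] = amp
            H[2 ** p + q, s + block // 2:s + block] = -amp
    return H

def haar_transform_2d(a):
    m, n = a.shape
    b = (haar_matrix(n) @ a.T).T       # 1-D Haar transform of each row
    return haar_matrix(m) @ b          # 1-D Haar transform of each column

def inverse_haar_transform_2d(c):
    m, n = c.shape
    return haar_matrix(m).T @ c @ haar_matrix(n)

a = np.random.rand(8, 16)
assert np.allclose(inverse_haar_transform_2d(haar_transform_2d(a)), a)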
Example: Consider the gray scale rendition of an input image of a 32 × 32 letter “A” as shown in Figure
8.7.2. In the gray scale images shown here, black corresponds to the lowest value in the image and white
corresponds to the highest value. For the original image, black = 0 and white = 255.
Figure 8.7.2 Input image.
The row-by-row Haar transform of the original image is shown in Figure 8.7.3, and the column-by-column
Haar transform of the original image is shown in Figure 8.7.4.
The Daubechies family of orthonormal wavelet bases for the space of square-integrable functions
generalizes the Haar wavelet basis. Each member of this family consists of the dyadic dilations and integer
translations
of a compactly supported wavelet function ψ(x). The wavelet function in turn is derived from a scaling
function φ(x), which also has compact support. The scaling and wavelet functions have good localization in
both the spatial and frequency domains [15, 18]. Daubechies was the first to describe compactly supported
orthonormal wavelet bases [18].
The orthonormal wavelet bases were developed by Daubechies within the framework of a multiresolution
analysis. Mallat [14] has exploited the features of multiresolution analysis to develop pyramidal
decomposition and reconstruction algorithms for images. The image algebra algorithms presented here are
based on these pyramidal schemes and build upon the work by Zhu and Ritter [12], in which the Daubechies
wavelet transform and its inverse are expressed in terms of the p-product. The one-dimensional wavelet
transforms (direct and inverse) described by Zhu and Ritter correspond to a single stage of the
one-dimensional pyramidal image algebra algorithms. Computer routines and further aspects of the
computation of the Daubechies wavelet transforms may be found in Press et al. [19].
For every positive integer g, a discrete wavelet system, denoted by D_2g, is defined by a finite number of coefficients
a_0, a_1, ..., a_{2g-1}. These coefficients, known as scaling or wavelet filter coefficients, satisfy specific orthogonality
conditions. The scaling function φ(x) is defined to be a solution of
defines the wavelet function ψ(x) associated with the scaling function of the wavelet system:
To illustrate how the scaling coefficients are derived, consider the case g = 2. Let
and
Note that s and w satisfy the orthogonality relation s • w = 0. The coefficients a_0, a_1, a_2, and a_3 are uniquely
determined by the following equations:
Solving for the four unknowns in the four Equations 8.8.1 yields
In general, the coefficients for the discrete wavelet system D2g are determined by 2g equations in 2g
unknowns. Daubechies has tabulated the coefficients for g = 2, 3,..., 10 [15, 18]. Higher-order wavelets are
smoother but have broader supports.
Let . For the direct and inverse D2g wavelet transforms, let
and
denote the scaling vector and the wavelet vector, respectively, of the wavelet transform. Furthermore, let
and
In the computation of the wavelet transform of a signal or image, the scaling vector acts as lowpass filter,
while the wavelet vector acts as a bandpass filter.
The algorithms in this section make use of specific column and row representations of matrices. For
and , the column vector of f is defined by
where .
The following pyramidal algorithm computes the one-dimensional wavelet transform of f, where
f is treated as a row vector.
Remarks
• The wavelet transform is computed in R stages. Each stage corresponds to a scale or level of
resolution. The number of stages required to compute the wavelet transform is a function of the length
N = 2^k of the input data vector and the length 2g of the wavelet or scaling vector.
Remarks
• The reconstructed data vector d has dimensions 1 × N.
• ds and dw are matrices having dimensions 2 × L.
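For concreteness, a single stage of a D4 decomposition and its inverse can be sketched as follows. The NumPy code below is only an illustration under stated assumptions: the widely tabulated D4 coefficients normalized so that their sum is √2, wavelet coefficients g_i = (-1)^i a_{3-i}, periodic wrap-around at the signal boundary, and downsampling by two; the book's pyramidal and p-product formulations may differ in these details.

import numpy as np

SQRT3 = np.sqrt(3.0)
# D4 scaling (lowpass) coefficients, normalized so that they sum to sqrt(2).
H4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
# Wavelet (bandpass) coefficients.
G4 = np.array([H4[3], -H4[2], H4[1], -H4[0]])

def d4_stage(a):
    # One pyramid stage: smooth part s and detail part w, each of length N/2.
    N = len(a)
    s = np.zeros(N // 2)
    w = np.zeros(N // 2)
    for m in range(N // 2):
        for i in range(4):
            s[m] += H4[i] * a[(2 * m + i) % N]   # lowpass filter, downsampled
            w[m] += G4[i] * a[(2 * m + i) % N]   # bandpass filter, downsampled
    return s, w

def d4_stage_inverse(s, w):
    # The analysis operator is orthogonal, so its transpose inverts the stage.
    N = 2 * len(s)
    a = np.zeros(N)
    for m in range(N // 2):
        for i in range(4):
            a[(2 * m + i) % N] += H4[i] * s[m] + G4[i] * w[m]
    return a

a = np.random.rand(16)
s, w = d4_stage(a)
assert np.allclose(d4_stage_inverse(s, w), a)    # perfect reconstruction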
Example: Let g = 2. The scaling coefficients for D_4 are given by Equations 8.8.3. A typical wavelet in the D_4
wavelet basis may be obtained by taking the inverse wavelet transform of a canonical unit vector of length N
= 2^k. For k = 8, the graph of the inverse D_4 wavelet transform of e_11 is shown in Figure 8.8.1.
with .
Example: Consider the gray scale rendition of an input image of a 32 × 32 letter “A” as shown in Figure
8.8.2. In the gray scale images shown here, black corresponds to the lowest value in the image and white
corresponds to the highest value. For the original image, black = 0 and white = 255.
Computing the column-by-column wavelet transform of the image in Figure 8.8.3, or computing the
row-by-row wavelet transform of the image in Figure 8.8.4, yields the D4 wavelet transform of the original
image. The result is shown in Figure 8.8.5. As in the Haar wavelet representation, the Daubechies wavelet
representation discriminates the location, scale, and orientation of edges in the image.
8.9. References
1 J. Cooley, P. Lewis, and P. Welch, “The fast Fourier transform and its applications,” IEEE
Transactions on Education, vol. E-12, no. 1, pp. 27-34, 1969.
2 J. Cooley and J. Tukey, “An algorithm for the machine calculation of complex Fourier series,”
Mathematics of Computation, vol. 19, pp. 297-301, 1965.
3 R. Gonzalez and P. Wintz, Digital Image Processing. Reading, MA: Addison-Wesley, second ed.,
1987.
4 G. Ritter, “Image algebra with applications.” Unpublished manuscript, available via anonymous ftp
from ftp://ftp.cis.ufl.edu/pub/src/ia/documents, 1994.
5 M. Narasimha and A. Peterson, “On the computation of the discrete cosine transform,” IEEE
Transactions on Communications, vol. COM-26, no. 6, pp. 934-936, 1978.
6 W. Chen, C. Smith, and S. Fralick, “A fast computational algorithm for the discrete cosine
transform,” IEEE Transactions on Communications, vol. COM-25, pp. 1004-1009, 1977.
7 N. Ahmed, T. Natarajan, and K. Rao, “Discrete cosine transform,” IEEE Transactions on Computers,
vol. C-23, pp. 90-93, 1974.
8 J. Walsh, “A closed set of normal orthogonal functions,” American Journal of Mathematics, vol. 45,
no. 1, pp. 5-24, 1923.
9 M. J. Hadamard, “Resolution d’une question relative aux determinants,” Bulletin des Sciences
Mathematiques, vol. A17, pp. 240-246, 1893.
10 R. E. A. C. Paley, “A remarkable series of orthogonal functions,” Proceedings of the London
Mathematical Society, vol. 34, pp. 241-279, 1932.
11 H. Zhu, The Generalized Matrix Product and its Applications in Signal Processing. Ph.D.
dissertation, University of Florida, Gainesville, 1993.
12 H. Zhu and G. X. Ritter, “The p-product and its applications in signal processing,” SIAM Journal of
Matrix Analysis and Applications, vol. 16, no. 2, pp. 579-601, 1995.
13 H. Andrews, Computer Techniques in Image Processing. New York: Academic Press, 1970.
14 S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE
Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, 1989.
15 I. Daubechies, Ten Lectures on Wavelets. CBMS-NSF Regional Conference Series in Applied
Mathematics, Philadelphia, PA: SIAM, 1992.
16 A. Haar, “Zur theorie der orthogonalen funktionensysteme,” Mathematische Annalen, vol. 69, pp.
331-371, 1910.
17 G. Strang, “Wavelet transforms versus Fourier transforms,” Bulletin of the American Mathematical
Society (new series), vol. 28, pp. 288-305, Apr. 1993.
18 I. Daubechies, “Orthonormal bases of wavelets,” Communications on Pure and Applied
Mathematics, vol. 41, pp. 909-996, 1988.
19 W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in C. Cambridge:
Cambridge University Press, 1992.
Chapter 9
Pattern Matching and Shape Detection
9.1. Introduction
This chapter covers two related image analysis tasks: object detection by pattern matching and shape
detection using Hough transform techniques. One of the most fundamental methods of detecting an object of
interest is by pattern matching using templates. In template matching, a replica of an object of interest is
compared to all objects in the image. If the pattern match between the template and an object in the image is
sufficiently close (e.g., exceeding a given threshold), then the object is labeled as the template object.
The Hough transform provides versatile methods for detecting shapes that can be described in terms of
closed parametric equations or in tabular form. Examples of parameterizable shapes are lines, circles, and
ellipses. Shapes that fail to have closed parametric equations can be detected by a generalized version of the
Hough transform that employs lookup table techniques. The algorithms presented in this chapter address both
parametric and non-parametric shape detection.
where the sum is over the support of p. The value of d will be small when a and p are almost identical and
large when they differ significantly. It follows that the term Σap will have to be large whenever d is small.
Therefore, a large value of Σap provides a good measure of a match. Shifting the pattern template p over all
possible locations of a and computing the match Σap at each location can therefore provide a set of candidate
pixels of a good match. Usually, thresholding determines the final locations of a possible good match.
The method just described is known as unnormalized correlation, matched filtering, or template matching.
There are several major problems associated with unnormalized correlation. If the values of a are large over
the template support at a particular location, then it is very likely that Σap is also large at that location, even if
no good match exists at that location. Another problem is in regions where a and p have a large number of
zeros in common (i.e., a good match of zeros). Since zeros do not contribute to an increase of the value Σap, a
mismatch may be declared, even though a good match exists. To remedy this situation, several methods of
normalized correlation have been proposed. One such method uses the formulation
where α = Σa and the sum is over the region of the support of p. Another method uses the factor
which keeps the values of the normalized correlation between -1 and 1. Values closer to 1 represent better
matches [1, 2].
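A straightforward sketch of the second normalization (Python/NumPy, illustration only; the mean-subtracted correlation coefficient is used here as one common way to keep values in [-1, 1], and windows with zero variance are simply assigned 0):

import numpy as np

def normalized_correlation(a, p):
    # Correlation coefficient between p and every p-sized window of a.
    ah, aw = a.shape
    ph, pw = p.shape
    out = np.zeros((ah - ph + 1, aw - pw + 1))
    pz = p - p.mean()
    pnorm = np.sqrt((pz * pz).sum())
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            win = a[y:y + ph, x:x + pw]
            wz = win - win.mean()
            denom = np.sqrt((wz * wz).sum()) * pnorm
            out[y, x] = (wz * pz).sum() / denom if denom > 0 else 0.0
    return out

# Thresholding the output (e.g., keeping values above 0.8) marks candidate
# locations of the pattern p in the image a.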
The following figures illustrate pattern matching using normalized correlation. Figure 9.2.1 is an image of an
industrial site with fuel storage tanks. The fuel storage tanks are the objects of interest for this example, thus
the image of Figure 9.2.2 is used as the pattern. Figure 9.2.3 is the image representation for the values of
positive normalized correlation between the storage tank pattern and the industrial site image for each point in
the domain of the industrial site image. Note that the locations of the six fuel storage tanks show up as bright
spots. There are also locations of strong correlation that are not locations of storage tanks. Referring back to
the source image and pattern, it is understandable why there is relatively strong correlation at these false
locations. Thresholding Figure 9.2.3 helps to pinpoint the locations of the storage tanks. Figure 9.2.4
represents the thresholded correlation image.
Figure 9.2.3 Image representation of positive normalized correlation resulting from applying the pattern of
Figure 9.2.2 to the image of Figure 9.2.1.
For (x + k, y + l) ∉ X, one assumes that a(x + k, y + l) = 0. It is also assumed that the pattern size is generally
smaller than the sensed image size. Figure 9.2.5 illustrates the correlation as expressed by Equation 9.2.1.
The following simple computation shows that this agrees with the formulation given by Equation 9.2.1.
By definition of the operation •, we have that
Since t is translation invariant, t(x, y)(u, v) = t(0, 0)(u - x, v - y). Thus, Equation 9.2.2 can be written as
Now t(0, 0)(u - x, v - y) = 0 unless (u - x, v - y) ∈ S(t(0, 0)) or, equivalently, unless -(m - 1) ≤ u - x ≤ m - 1 and -(n -
1) ≤ v - y ≤ n - 1. Changing variables by letting k = u - x and l = v - y changes Equation 9.2.3 to
To compute the normalized correlation image c, let N denote the neighborhood function defined by N(y) =
S(t_y). The normalized correlation image is then computed as
Note that Σt(0, 0) is simply the sum of all pixel values of the pattern template at the origin.
where â denotes the Fourier transform of a, denotes the complex conjugate of , and the inverse
Fourier transform. Thus, simple pointwise multiplication of the image â with the image and Fourier
transforming the result implements the spatial correlation a •t.
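The frequency-domain computation can be sketched as follows (NumPy, illustrative; the images are zero padded to avoid wrap-around, which corresponds to the assumption a = 0 outside X, and the pattern is supplied directly as an array rather than through the template construction of the text):

import numpy as np

def correlate_fft(a, p):
    # Pointwise multiplication by the conjugate spectrum of p, then inverse FFT.
    ah, aw = a.shape
    ph, pw = p.shape
    Hh, Ww = ah + ph - 1, aw + pw - 1
    A = np.fft.fft2(a, s=(Hh, Ww))
    P = np.fft.fft2(p, s=(Hh, Ww))
    return np.real(np.fft.ifft2(A * np.conj(P)))

# correlate_fft(a, p)[y, x] equals sum over (k, l) of a(y + k, x + l) * p(k, l)
# for 0 <= y <= ah - ph and 0 <= x <= aw - pw, up to round-off error.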
One limitation of the matched filter given by Equation 9.3.1 is that the output of the filter depends primarily
on the gray values of the image a rather than on its spatial structures. This can be observed when considering
the output image and its corresponding gray value surface shown in Figure 9.3.2. For example, the letter E in
the input image (Figure 9.3.1) produced a high-energy output when correlated with the pattern letter B shown
in Figure 9.3.1. Additionally, the filter output is proportional to its autocorrelation, and the shape of the filter
output around its maximum match is fairly broad. Accurately locating this maximum can therefore be difficult
in the presence of noise. Normalizing the correlated image , as done in the previous
section, alleviates the problem for some mismatched patterns. Figure 9.3.3 provides an example of a
normalized output. An approach to solving this problem in the Fourier domain is to use phase-only matched
filters.
The transfer function of the phase-only matched filter is obtained by eliminating the amplitude of through
factorization. As shown in Figure 9.3.4, the output of the phase-only matched filter provides a
much sharper peak than the simple matched filter since the spectral phase preserves the location of objects but
is insensitive to the image energy [4].
Further improvements of the phase-only matched filter can be achieved by correlating the phases of both a
and t. Figure 9.3.5 shows that the image function approximates the Dirac δ-function at the
center of the letter B, thus providing even sharper peaks than the phase-only matched filter. Note also the
suppression of the enlarged B in the left-hand corner of the image. This filtering technique is known as the
symmetric phase-only matched filter or SPOMF [5].
Figure 9.3.1 The input image a is shown on the left and the pattern template t on the right.
Figure 9.3.2 The correlated output image and its gray value surface.
Figure 9.3.4 The phase-only correlation image and its gray value surface.
Figure 9.3.5 The symmetric phase-only correlation image and its gray value surface.
To create the image , reflect the pattern template t across the origin by setting p(x) = t(0, 0)(-x). This will
correspond to conjugation in the spectral domain. Since t is an invariant template defined on , p is an
image defined over . However, what is needed is an image over domain(a) = X. As shown in Figure 9.2.5,
the support of the pattern template intersects X in only the positive quadrant. Hence, simple restriction of p to
X will not work. Additionally, the pattern image p needs to be centered with respect to the transformed image
â for the appropriate multiplication in the Fourier domain. This is achieved by translating p to the corners of a
and then restricting to the domain of a. This process is illustrated in Figure 9.3.6. Specifically, define
Note that there are two types of additions in the above formula. One is image addition and the other is
image-point addition which results in a shift of the image. Also, the translation of the image p, achieved by
vector addition, translates p one unit beyond the boundary of X in order to avoid duplication of the
intersection of the boundary of X with p at the corner of the origin.
Figure 9.3.6 The image p created from the pattern template t using reflection, translations to the corners of
the array X, and restriction to X. The values of p are zero except in the shaded area, where the values are
equal to the corresponding gray level values in the support of t at (0,0).
The image is now given by , the Fourier transform of p. The correlation image c can therefore be
obtained using the following algorithm:
Using the image p constructed in the above algorithm, the phase-only filter and the symmetric phase-only
filter now have the following simple formulation:
and
respectively.
, where denotes the pseudoinverse of . A similar comment holds for the quotient
.
Some further improvements of the symmetric phase-only matched filter can be achieved by processing the
spectral phases [6, 7, 8, 9].
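Hedged sketches of the two filters (NumPy; a small constant eps stands in for the pseudoinverse treatment of zero spectral magnitudes mentioned above, and the pattern is zero padded to the image size):

import numpy as np

def phase_only_match(a, p):
    # Keep only the phase of the pattern spectrum.
    A = np.fft.fft2(a)
    P = np.fft.fft2(p, s=a.shape)
    eps = 1e-12
    return np.real(np.fft.ifft2(A * np.conj(P) / (np.abs(P) + eps)))

def symmetric_phase_only_match(a, p):
    # Correlate the phases of both spectra (SPOMF).
    A = np.fft.fft2(a)
    P = np.fft.fft2(p, s=a.shape)
    eps = 1e-12
    return np.real(np.fft.ifft2((A / (np.abs(A) + eps)) *
                                (np.conj(P) / (np.abs(P) + eps))))

# The SPOMF output approximates a delta function at the location of the
# pattern, giving a much sharper peak than the plain matched filter.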
where x_0 = y_0 = N/2 denote the midpoint coordinates of the N × N domain of â. However, rotation of a(x, y)
rotates |â(u, v)| by the same amount. This rotational effect can be taken care of by transforming |â(u, v)| to
polar form (u, v) → (r, θ). A rotation of a(x, y) will then manifest itself as a shift in the angle θ. After
determining this shift, the pattern template can be rotated through the angle θ and then used in one of the
standard correlation schemes in order to find the location of the pattern in the image.
The exact specification of this technique — which, in the digital domain, is by no means trivial — is provided
by the image algebra formulation below.
Step 1. Extend the pattern template t to a pattern image over the domain of a.
This step can be achieved in a variety of ways. One way is to use the method defined in Section 9.3. Another
method is to simply set
This is equivalent to extending t(N/2,N/2) outside of its support to the zero image on .
Centering is a necessary step in order to avoid boundary effects when using bilinear interpolation at a
subsequent stage of this algorithm.
Step 4. Scale the Fourier spectrum.
This step is vital for the success of the proposed method. Image spectra decrease rather rapidly as a function
of increasing frequency, resulting in suppression of high-frequency terms. Taking the logarithm of the Fourier
spectrum increases the amplitude of the side lobes and thus provides for more accurate results when
employing the symmetric phase-only filter at a later stage of this algorithm.
The conversion of â and to continuous images is accomplished by using bilinear interpolation. An image
algebra formulation of bilinear interpolation can be found in [15]. Note that because of Step 4, â and are
real-valued images. Thus, if âb and denote the interpolated images, then , where
That is, âb and are real-valued images over a point set X with real-valued coordinates.
Although nearest neighbor interpolation can be used, bilinear interpolation results in a more robust matching
algorithm.
Step 6. Convert to polar coordinates.
Define the point set
The reason for choosing grid spacing in Step 6 is that the maximum value of r is
which prevents mapping the polar coordinates outside the set X. Finer sampling grids
will further improve the accuracy of pattern detection; however, computational costs will increase
proportionally. A major drawback of this method is that it works best only when a single object is present in
the image, and when the image and template backgrounds are identical.
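A compact sketch of the rectangular-to-polar step (NumPy; a square N × N input, bilinear interpolation of the centered log-magnitude spectrum, and a radial range chosen so that no sample falls outside the array are all assumptions made for this illustration):

import numpy as np

def spectrum_to_polar(a, n_r=None, n_theta=None):
    # Resample the centered log spectrum of a onto an (r, theta) grid.
    N = a.shape[0]
    n_r = n_r or N
    n_theta = n_theta or N
    spec = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(a))))
    radii = np.linspace(0, N / 2 - 2, n_r)
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    out = np.zeros((n_r, n_theta))
    cy = cx = N / 2
    for i, r in enumerate(radii):
        for j, t in enumerate(thetas):
            y, x = cy + r * np.sin(t), cx + r * np.cos(t)
            y0, x0 = int(np.floor(y)), int(np.floor(x))
            dy, dx = y - y0, x - x0
            # bilinear interpolation between the four surrounding samples
            out[i, j] = ((1 - dy) * (1 - dx) * spec[y0, x0] +
                         (1 - dy) * dx * spec[y0, x0 + 1] +
                         dy * (1 - dx) * spec[y0 + 1, x0] +
                         dy * dx * spec[y0 + 1, x0 + 1])
    return out

# A rotation of the input image appears as a circular shift along the theta
# axis of this representation, which the SPOMF can then detect.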
Therefore,
Therefore,
which is the desired result.
It follows that combining the Fourier and Mellin transform with a rectangular to polar conversion yields a
rotation and scale invariant matching scheme. The approach takes advantage of the individual invariance
properties of these two transforms as summarized by the following four basic steps:
(1) Fourier transform
(4) SPOMF
As in the rotation invariant filter, the output of the SPOMF will result in two peaks which will be a distance π
apart in the θ direction and will be offset by a factor proportional to the scaling factor from the θ axis. For 0 ≤ i
< N/2, this will correspond to an enlargement of the pattern, while for N/2 < i ≤ N - 1, the proportional scaling
factor will correspond to a shrinking of the pattern [5]. The following example (Figures 9.5.1 to 9.5.3)
illustrates the important steps of this algorithm.
Figure 9.5.1 The input image a is shown on the left and the pattern template p on the right.
The Hough transform is a mapping from into the function space of sinusoidal functions. It was first
formulated in 1962 by Hough [16]. Since its early formulation, this transform has undergone intense
investigations which have resulted in several generalizations and a variety of applications in computer vision
and image processing [1, 2, 17, 18, 19]. In this section we present a method for finding straight lines using the
Hough transform. The input for the Hough transform is an image that has been preprocessed by some type of
edge detector and thresholded (see Chapters 3 and 4). Specifically, the input should be a binary edge image.
A straight “line” in the sense of the Hough algorithm is a colinear set of points. Thus, the number of points in
a straight line could range from one to the number of pixels along the diagonal of the image. The quality of a
straight “line” is judged by the number of points in it. It is assumed that the natural straight lines in an image
correspond to digitized straight “lines” in the image with relatively large cardinality.
A brute force approach to finding straight lines in a binary image with N feature pixels would be to examine
all possible straight lines between the feature pixels. For each of the possible lines, N - 2
tests for colinearity must be performed. Thus, the brute force approach has a computational complexity on the
order of N3. The Hough algorithm provides a method of reducing this computational cost.
To begin the description of the Hough algorithm, we first define the Hough transform and examine some of
its properties. The Hough transform is a mapping h from into the function space of sinusoidal functions
defined by
To see how the Hough transform can be used to find straight lines in an image, a few observations need to be
made.
Any straight line l_0 in the xy-plane corresponds to a point (ρ_0, θ_0) in the ρθ-plane, where θ_0 ∈ [0, π) and
. Let n_0 be the line normal to l_0 that passes through the origin of the xy-plane. The angle n_0
makes with the positive x-axis is θ_0. The distance from (0, 0) to l_0 along n_0 is |ρ_0|. Figure 9.6.1 below
illustrates the relation between l_0, n_0, θ_0, and ρ_0. Note that the x-axis in the figure corresponds to the point (0,
0), while the y-axis corresponds to the point (0, π/2).
As an example, consider the points (1, 7), (3, 5), (5, 3), and (6, 2) in the xy-plane that lie along the line l_0 with
θ and ρ representation and , respectively.
Figure 9.6.2 shows these points and the line l0.
The Hough transform maps the indicated points to the sinusoidal functions as follows:
The graphs of these sinusoidal functions can be seen in Figure 9.6.3. Notice how the four sinusoidal curves
intersect at and .
Each point (x, y) at a feature pixel in the domain of the image maps to a sinusoidal function by the Hough
transform. If the feature point (x_i, y_i) of an image lies on a line in the xy-plane parameterized by (ρ_0, θ_0), its
corresponding representation as a sinusoidal curve in the ρθ-plane will intersect the point (ρ_0, θ_0).
Also, the sinusoid ρ = x cos(θ) + y sin(θ) will intersect (ρ_0, θ_0) only if the feature pixel location (x, y) lies on
the line (ρ_0, θ_0) in the xy-plane. Therefore, it is possible to count the number of feature pixel points that lie along
the line (ρ_0, θ_0) in the xy-plane by counting the number of sinusoidal curves in the ρθ-plane that intersect at
the point (ρ_0, θ_0). This observation is the basis of the Hough line detection algorithm which is described next.
Obviously, it is impossible to count the number of intersections of sinusoidal curves at every point in the
ρθ-plane. The ρθ-plane for -R ≤ ρ ≤ R, 0 ≤ θ < π must be quantized. This quantization is represented as an
accumulator array a(i, j). Suppose that for a particular application it is decided that the ρθ-plane should be
quantized into an r × c accumulator array. Each column of the accumulator represents an increment in the
angle θ. Each row in the accumulator represents an increment in ρ. The cell location a(i, j) of the
accumulator is used as a counting bin for the point in the ρθ-plane (and the
corresponding line in the xy-plane).
Initially, every cell of the accumulator is set to 0. The value a(i, j) of the accumulator is incremented by 1 for
every feature pixel (x, y) location at which the inequality
The criterion for a good line in the Hough algorithm sense is a large number of colinear points. Therefore, the
larger entries in the accumulator are assumed to correspond to lines in the image.
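A minimal accumulator sketch (NumPy; edges is a binary edge image as produced by the preprocessing mentioned above, and the quantization of ρ into n_rho bins over [-R, R] and θ into n_theta bins over [0, π) follows the description; the names and bin counts are illustrative):

import numpy as np

def hough_lines(edges, n_rho=180, n_theta=180):
    # Each feature pixel votes along its sinusoid rho = x*cos(theta) + y*sin(theta).
    ys, xs = np.nonzero(edges)
    R = np.hypot(*edges.shape)                  # bound on |rho|
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in zip(xs, ys):
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        i = np.round((rho + R) / (2 * R) * (n_rho - 1)).astype(int)
        acc[i, np.arange(n_theta)] += 1
    return acc, thetas

# Large accumulator entries correspond to lines containing many feature
# pixels; np.unravel_index(acc.argmax(), acc.shape) gives the strongest one.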
Computation of this variant template sum is computationally intensive and inefficient. A more efficient
implementation is given below.
by
The statement
The Hough transform used for circle detection is a map defined over feature points in the domain of the image
into the function space of conic surfaces. The Hough transform h used for circle detection is the map
Note that the Hough transform is only applied to the feature points in the domain of the image. The surface (χ - x0)² + (ψ - y0)² - ρ² = 0 is a cone in χψρ-space with χψ intercept (x0, y0, 0) (see Figure 9.7.1).
The point (xi, yi) lies on the circle (χ0, ψ0, ρ0) in the xy-plane if and only if the conic surface (χ - xi)² + (ψ - yi)² - ρ² = 0 intersects the point (χ0, ψ0, ρ0) in χψρ-space. For example, the points (-1, 0) and (1, 0) lie on the circle f((x, y), (0, 0, 1)) = x² + y² - 1 = 0 in the xy-plane. The Hough transform used for circle detection maps these points to conic surfaces in χψρ-space as follows
These two conic surfaces intersect the point (0, 0, 1) in χψρ-space (see Figure 9.7.2).
Figure 9.7.2 The two conics corresponding to the two points (-1, 0) and (1, 0) on the circle x² + y² = 1 under the Hough transform h.
The feature point (xi, yi) will lie on the circle (χ0, ψ0, ρ0) in the xy-plane if and only if its image under the Hough transform intersects the point (χ0, ψ0, ρ0) in χψρ-space. More generally, the point xi will lie on the curve f(x, p0) = 0 in the domain of the image if and only if the curve f(xi, p) = 0 intersects the point p0 in the parameter space. Therefore, the number of feature points in the domain of the image that lie on the curve f(x, p0) = 0 can be counted by counting the number of elements in the range of the Hough transform that intersect p0.
As in the case of line detection, the parameter space must be quantized. The accumulator matrix is the representation of the quantized parameter space. For circle detection the accumulator a will be a three-dimensional matrix with all entries initially set to 0. The entry a(χr, ψs, ρt) is incremented by 1 for every feature point (xi, yi) in the domain of the image whose conic surface in χψρ-space passes through (χr, ψs, ρt). More precisely, a(χr, ψs, ρt) is incremented provided
where ε is used to compensate for digitization and quantization. Shapiro [20, 21, 22] discusses error analysis when using the Hough transform. If the above inequality holds, it implies that the conic surface (χ - xi)² + (ψ - yi)² - ρ² = 0 passes through the point (χr, ψs, ρt) (within a margin of error) in χψρ-space. This means the point (xi, yi) lies on the circle (x - χr)² + (y - ψs)² - ρt² = 0 in the xy-plane, and thus the accumulator value a(χr, ψs, ρt) should be incremented by 1.
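A brute-force Python/NumPy sketch of this voting rule is given below; the quantizations chi_vals, psi_vals, and rho_vals and the tolerance eps are illustrative assumptions, and the test applied is the inequality with tolerance ε described above.
import numpy as np
def hough_circles(feature, chi_vals, psi_vals, rho_vals, eps=1.0):
    # feature: 2-D Boolean array of feature pixels
    # chi_vals, psi_vals, rho_vals: 1-D arrays quantizing the chi-psi-rho parameter space
    acc = np.zeros((len(chi_vals), len(psi_vals), len(rho_vals)), dtype=int)
    ys, xs = np.nonzero(feature)
    for x, y in zip(xs, ys):
        for r, chi in enumerate(chi_vals):
            for s, psi in enumerate(psi_vals):
                # apply |(chi - x)^2 + (psi - y)^2 - rho^2| < eps for every candidate rho
                hit = np.abs((chi - x) ** 2 + (psi - y) ** 2 - rho_vals ** 2) < eps
                acc[r, s, hit] += 1
    return acc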
Circles are among the most commonly found objects in images. Circles are special cases of ellipses, and circles viewed at an angle appear as ellipses. We extend circle detection to ellipse detection next. The method for ellipse detection that we will be discussing is taken from Ballard [23].
The equation for an ellipse centered at (χ, ψ) with axes parallel to the coordinate axes (Figure 9.7.3) is
which gives us
For ellipse detection we will assume that the original image has been preprocessed by a directional edge detector and thresholded based on edge magnitude (Chapters 3 and 4). Therefore, we assume that an edge direction image d ∈ [0, 2π)^X exists, where X is the domain of the original image. The direction d(x, y) is the direction of the gradient at the point (x, y) on the ellipse. The tangent to the ellipse at
With this edge direction and orientation information we can write χ and ψ as
and
respectively.
The accumulator array for ellipse detection will be a five-dimensional array a. Every entry of a is initially set to zero. For every feature point (x, y) of the edge direction image, the accumulator cell a(χr, ψs, θt, αu, βv) is incremented by 1 whenever
and
Larger accumulator entry values are assumed to correspond to better ellipses. If an accumulator entry is
judged large enough, its coordinates are deemed to be the parameters of an ellipse in the original image.
It is important to note that gradient information is used in the preceding description of an ellipse. As a
consequence, gradient information is used in determining whether a point lies on an ellipse. Gradient
information shows up as the term
in the equations that were derived above. The incorporation of gradient information improves the accuracy
and computational efficiency of the algorithm. Our original example of circle detection did not use gradient
information. However, circles are special cases of ellipses and circle detection using gradient information
follows immediately from the description of the ellipse detection algorithm.
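For the circle case the saving is easy to see: with d(x, y) available, each edge pixel needs to vote only for centers lying along its gradient ray instead of for every candidate center. The sketch below illustrates this idea under the assumption that gradients point away from the center; it is an illustrative Python/NumPy sketch, not the image algebra formulation of the text.
import numpy as np
def hough_circles_gradient(feature, direction, rho_vals, center_shape):
    # feature: Boolean edge image; direction[y, x]: gradient angle at (x, y) in radians
    # rho_vals: candidate radii; center_shape: (rows, cols) quantization of the centers
    acc = np.zeros((center_shape[0], center_shape[1], len(rho_vals)), dtype=int)
    ys, xs = np.nonzero(feature)
    for x, y in zip(xs, ys):
        phi = direction[y, x]
        for t, rho in enumerate(rho_vals):
            # the center is assumed to lie a distance rho from (x, y) against the gradient;
            # voting in the opposite direction as well handles inward-pointing gradients
            chi = int(round(x - rho * np.cos(phi)))
            psi = int(round(y - rho * np.sin(phi)))
            if 0 <= psi < center_shape[0] and 0 <= chi < center_shape[1]:
                acc[psi, chi, t] += 1
    return acc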
Similar to the implementation of the Hough transform for line detection, efficient incrementing of accumulator cells can be obtained by defining the neighborhood function N : Y → 2^X by
The accumulator array can now be computed by using the following image algebra pseudocode:
The R-table is indexed by values of gradient angles for points on the boundary of S(χ, ψ). For each gradient angle in the R-table, there corresponds a set of (r, α) magnitude-angle pairs. The pair (r, α) is an element of that set if and only if there exists a point (x, y) on the boundary of S(χ, ψ) whose gradient direction is the indexing angle and whose vector has magnitude r and direction α. Thus,
If (x, y) ∈ S(χ, ψ) there is a , 1 ≤ i ≤ K, in the R-table such that . For that there is a
and
An important observation to make about the R-table is that by indexing the gradient direction, gradient
information is incorporated into the shape’s description. As in the case of ellipse detection (Section 9.7), this
will make the Hough algorithm more accurate and more computationally efficient.
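As a hypothetical illustration, an R-table of this kind can be built with the following Python sketch; the boundary points, their gradient directions, the reference point, and the angle quantization n_bins are all assumed inputs, not part of the original formulation.
import math
from collections import defaultdict
def build_r_table(boundary_points, gradient_angle, reference, n_bins=64):
    # boundary_points: list of (x, y) points on the boundary of the model shape S
    # gradient_angle: dict mapping each boundary point to its gradient direction (radians)
    # reference: the reference point (chi, psi) of the shape
    table = defaultdict(list)
    for (x, y) in boundary_points:
        phi = gradient_angle[(x, y)] % (2 * math.pi)
        k = int(phi / (2 * math.pi) * n_bins)            # quantized gradient-angle index
        dx, dy = reference[0] - x, reference[1] - y      # vector from the point to the reference
        r = math.hypot(dx, dy)
        alpha = math.atan2(dy, dx)
        table[k].append((r, alpha))                      # the (r, alpha) pair stored under this angle
    return table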
As it stands now, the shape S(χ, ψ) is parameterized only by the location of its reference point. The R-table for S(χ, ψ) can only be used to detect shapes that are the result of translations of S(χ, ψ) in the xy-plane.
A scaling factor can be accommodated by noting that a scaling of S(χ, ψ) by s is represented by scaling all the vector magnitudes r in the R-table by s.
The entries of the accumulator are indexed by a quantization of the shape's χψsθ parameter space. Thus, the accumulator is a four-dimensional array with entries
Entry a(χs, ψt, su, θw) is the counting bin for points in the xy-plane that lie on the boundary of the shape S(χs, ψt, su, θw). Initially all the entries of the accumulator are 0.
For each feature edge pixel (x, y) add 1 to accumulator entry a(χs, ψt, su, θw) if the following condition, denoted by C(x, y, χs, ψt, su, θw, ε1, ε2, ε3), holds. For positive numbers ε1, ε2, and ε3, C(x, y, χs, ψt, su, θw, ε1, ε2, ε3) is satisfied if there exists a k, 1 ≤ k ≤ Nj, such that
and
when . The ε1, ε2, and ε3 are error tolerances to allow for quantization and digitization.
Larger counts at an accumulator cell mean a higher probability that the indexes of the accumulator cell are the
parameters of a shape in the image.
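Continuing the hypothetical R-table sketch above, accumulator voting over translation, scale, and rotation can be written roughly as follows; the quantizations of s and θ and the simplified tolerance handling stand in for the condition C of the text.
import math
import numpy as np
def ghough_vote(feature_points, gradient_angle, table, n_bins, center_shape, scales, rotations):
    # 4-D accumulator over (psi, chi, scale, rotation)
    acc = np.zeros((center_shape[0], center_shape[1], len(scales), len(rotations)), dtype=int)
    for (x, y) in feature_points:
        phi = gradient_angle[(x, y)]
        for u, s in enumerate(scales):
            for w, theta in enumerate(rotations):
                # a rotation of the shape by theta shifts the indexing gradient angle as well
                k = int((((phi - theta) % (2 * math.pi)) / (2 * math.pi)) * n_bins)
                for (r, alpha) in table.get(k, []):
                    chi = int(round(x + s * r * math.cos(alpha + theta)))
                    psi = int(round(y + s * r * math.sin(alpha + theta)))
                    if 0 <= psi < center_shape[0] and 0 <= chi < center_shape[1]:
                        acc[psi, chi, u, w] += 1
    return acc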
The set is the set from the R-table that corresponds to the gradient angle . The accumulator is
constructed with the image algebra statement
9.9. References
1 R. Gonzalez and P. Wintz, Digital Image Processing. Reading, MA: Addison-Wesley, second ed.,
1987.
2 D. Ballard and C. Brown, Computer Vision. Englewood Cliffs, NJ: Prentice-Hall, 1982.
3 W. Rudin, Real and Complex Analysis. New York, NY: McGraw-Hill, 1974.
4 A. Oppenheim and J. Lim, “The importance of phase in signals,” IEEE Proceedings, vol. 69, pp.
529-541, 1981.
5 Q. Chen, M. Defrise, and F. Deconinck, “Symmetric phase-only matched filtering of Fourier-Mellin
transforms for image registration and recognition,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 16, pp. 1156-1168, July 1994.
6 V. K. P. Kumar and Z. Bahri, “Efficient algorithm for designing a ternary valued filter yielding
maximum signal to noise ratio,” Applied Optics, vol. 28, no. 10, 1989.
7 O. K. Ersoy and M. Zeng, “Nonlinear matched filtering,” Journal of the Optical Society of America,
vol. 6, no. 5, pp. 636-648, 1989.
8 J. Horner and P. Gianino, “Pattern recognition with binary phase-only filters,” Applied Optics, vol.
24, pp. 609-611, 1985.
9 D. Flannery, J. Loomis, and M. Milkovich, “Transform-ratio ternary phase-amplitude filter
formulation for improved correlation discrimination,” Applied Optics, vol. 27, no. 19, pp. 4079-4083,
1988.
10 D. Casasent and D. Psaltis, “Position, rotation and scale-invariant optical correlation,” Applied
Optics, vol. 15, pp. 1793-1799, 1976.
11 D. Casasent and D. Psaltis, “New optical transform for pattern recognition,” Proceedings of the
IEEE, vol. 65, pp. 77-84, 1977.
12 Y. Hsu, H. Arsenault, and G. April, “Rotation-invariant digital pattern recognition using circular
harmonic expansion,” Applied Optics, vol. 21, no. 22, pp. 4012-4015, 1982.
13 Y. Sheng and H. Arsenault, “Experiments on pattern recognition using invariant Fourier-Mellin
descriptors,” Journal of the Optical Society of America, vol. 3, no. 6, pp. 771-776, 1986.
14 E. D. Castro and C. Morandi, “Registration of translated and rotated images using finite Fourier
transforms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, pp.
700-703, 1987.
15 G. Ritter, “Image algebra with applications.” Unpublished manuscript, available via anonymous ftp
from ftp://ftp.cis.ufl.edu/pub/src/ia/documents, 1994.
16 P. Hough, “Method and means for recognizing complex patterns,” Dec. 18 1962. U.S. Patent
3,069,654.
17 R. Duda and P. Hart, “Use of the Hough transform to detect lines and curves in pictures,”
Communications of the ACM, vol. 15, no. 1, pp. 11-15, 1972.
18 E. R. Davies, Machine Vision: Theory, Algorithms, Practicalities. New York: Academic Press,
1990.
19 A. Rosenfeld, Picture Processing by Computer. New York: Academic Press, 1969.
20 S. D. Shapiro, “Transformations for the computer detection of curves in noisy pictures,”
Computer Graphics and Image Processing, vol. 4, pp. 328-338, 1975.
21 S. D. Shapiro, “Properties of transforms for the detection of curves in noisy images,” Computer
Graphics and Image Processing, vol. 8, pp. 219-236, 1978.
22 S. D. Shapiro, “Feature space transforms for curve detection,” Pattern Recognition, vol. 10, pp.
129-143, 1978.
23 D. Ballard, “Generalizing the Hough transform to detect arbitrary shapes,” Pattern Recognition,
vol. 13, no. 2, pp. 111-122, 1981.
Chapter 10
Image Features and Descriptors
10.1. Introduction
The purpose of this chapter is to provide examples of image feature and descriptor extraction in an image
algebra setting. Image features are useful extractable attributes of images or regions within an image.
Examples of image features are the histogram of pixel values and the symmetry of a region of interest.
Histograms of pixel values are considered a primitive characteristic, or low-level feature, while such
geometric descriptors as symmetry or orientation of regions of interest provide examples of high-level
features. Significant computational effort may be required to extract high-level features from a raw gray-value
image.
Generally, image features provide ways of describing image properties, or properties of image components,
that are derived from a number of different information domains. Some features, such as histograms and
texture features, are statistical in nature, while others, such as the Euler number or boundary descriptors of
regions, fall in the geometric domain. Geometric features are key in providing structural descriptors of
images. Structural descriptions of images are of significant interest. They are an important component of
high-level image analysis and may also provide storage-efficient representations of image data. The objective
of high-level image analysis is to interpret image information. This involves relating image gray levels to a
variety of non-image-related knowledge underlying scene objects and reasoning about contents and structures
of complex scenes in images. Typically, high-level image analysis starts with structural descriptions of
images, such as a coded representation of an image object or the relationships between objects in an image.
The descriptive terms “inside of” and “adjacent to” denote relations between certain objects. Representation
and manipulation of these kinds of relations is fundamental to image analysis. Here one assumes that a
preprocessing algorithm has produced a regionally segmented image whose regions represent objects in the
image, and considers extracting the relation involving the image regions from the regionally segmented
image.
10.2. Area and Perimeter
Area and perimeter are commonly used descriptors for regions in the plane. In this section image algebra
formulations for the calculation of area and perimeter are presented.
Let R denote the region in X whose points have pixel value 1. One way to calculate the area A of R is simply to count the number of points in R. This can be accomplished with the image algebra statement
The perimeter P of R may be calculated by totaling the number of 0-valued 4-neighbors for each point in R.
The image algebra statement for the perimeter of R using this definition is given by
Let b := a ⊕ t. The alternate formulations for area and perimeter are given by
and
respectively.
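A direct NumPy counterpart of the area and perimeter statements might look as follows; a is assumed to be a 2-D 0/1 array, and points outside the image are treated as background.
import numpy as np
def area_and_perimeter(a):
    # a: 2-D array of 0s and 1s; R is the set of 1-valued points
    area = int(a.sum())                                  # number of points in R
    padded = np.pad(a, 1, constant_values=0)             # everything outside the image is 0
    perim = 0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):    # the four 4-neighbor directions
        neighbor = padded[1 + dy:1 + dy + a.shape[0], 1 + dx:1 + dx + a.shape[1]]
        perim += int(((a == 1) & (neighbor == 0)).sum()) # region points with a 0-valued 4-neighbor
    return area, perim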
Here, c4(a) denotes the number of 4-connected feature components of a and h8(a) is the number of
8-connected holes within the feature components.
The 8-connected Euler number, e8(a), of a is defined by
where c8(a) denotes the number of 8-connected feature components of a and h4(a) is the number of
4-connected holes within the feature components.
Table 10.3.1 shows Euler numbers for some simple pixel configurations. Feature pixels are black.
Table 10.3.1 Examples of Pixel Configurations and Euler Numbers
No. a c4(a) c8(a) h4(a) h8(a) e4(a) e8(a)
1. 1 1 0 0 1 1
2. 5 1 0 0 5 1
3. 1 1 1 1 0 0
4. 4 1 1 0 4 0
5. 2 1 4 1 1 -3
6. 1 1 5 1 0 -4
7. 2 2 1 1 1 1
Let a ∈ {0,1}^X be a Boolean input image and let b := a ⊕ t, where t is the template represented in Figure 10.3.1. The 4-connected Euler number is expressed as
Figure 10.3.1 Pictorial representation of the template used for determining the Euler number.
The image algebra expressions of the two types of Euler numbers were derived from the quad pattern
formulations given in [1, 3].
The 4-connected and 8-connected Euler numbers can then be evaluated using lookup tables via the
expressions
respectively.
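The quad-pattern idea of [1, 3] can also be coded directly. The sketch below counts the 2 × 2 configurations of the padded image containing one feature pixel, three feature pixels, and two diagonally placed feature pixels, and combines the counts with Gray's formulas; the lookup tables of the image algebra formulation are not reproduced here.
import numpy as np
def euler_numbers(a):
    # a: 2-D array of 0s and 1s
    p = np.pad(a, 1, constant_values=0)
    # the four corners of every 2 x 2 window of the padded image
    q = np.stack([p[:-1, :-1], p[:-1, 1:], p[1:, :-1], p[1:, 1:]])
    ones = q.sum(axis=0)
    n_q1 = int((ones == 1).sum())                      # windows with exactly one feature pixel
    n_q3 = int((ones == 3).sum())                      # windows with exactly three feature pixels
    n_qd = int(((ones == 2) & (q[0] == q[3])).sum())   # the two diagonal two-pixel windows
    e4 = (n_q1 - n_q3 + 2 * n_qd) // 4                 # 4-connected Euler number
    e8 = (n_q1 - n_q3 - 2 * n_qd) // 4                 # 8-connected Euler number
    return e4, e8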
The value of j at which ρcc′ takes on its maximum is the offset of the segment of c′ that best matches c. The closer the maximum is to 1, the better the match.
Image Algebra Formulation
Chain Code Extraction
In this section we present an image algebra formulation for extracting the chain code from an 8-connected object in a Boolean image. This algorithm is capable of extracting the chain code from objects typically
considered to be poor candidates for chain coding purposes. Specifically, this algorithm is capable of coping
with pinch points and feelers (as shown in Figure 10.4.3).
Let a ∈ {0, 1}^X be the source image. The chain code extraction algorithm proceeds in three phases. The first phase uses the census template t shown in Figure 10.4.4 together with the linear image-template product operation ⊕ to assign each point x ∈ X a number between 0 and 511. This number represents the configuration of the pixel values of a in the 8-neighborhood about x. Neighborhood configuration information is stored in the image n, where
n := a ⊕ t.
This information will be used in the last phase of the algorithm to guide the selection of directions during the
extraction of the chain code.
The next step of the algorithm extracts a starting point (the initial point) on the interior 8-boundary of the
object and provides an initial direction. This is accomplished by the statement:
Here we assume the lexicographical (row scanning) order. Hence, the starting point x0 is the lexicographical
minimum of the point set domain(a||1). For a clockwise traversal of the object’s boundary the starting
direction d is initially set to 0.
Given a direction, the function δ provides the increments required to move to the next point along the path of the chain code in X. The values of δ are given by
δ(0) = (0, 1),
δ(1) = (-1, 1),
δ(2) = (-1, 0),
δ(3) = (-1, -1),
δ(4) = (0, -1),
δ(5) = (1, -1),
δ(6) = (1, 0), and
δ(7) = (1, 1).
The key to the algorithm is the function f. Given a previous direction d and a neighborhood characterization
number c = n(x), the function f provides the next direction in the chain code. The value of f is computed using
the following pseudocode:
f := -1
i := 0
while f = -1 loop
f := (d + 2 - i) mod 8
if bf(c) = 0 then
f := -1
end if
i := i + 1
end loop.
Note that f as defined above is designed for a clockwise traversal of the object’s boundary. For a
counterclockwise traversal, the line
f := (d + 2 - i) mod 8
should be replaced by
f := (d + 6 + i) mod 8.
Also, the starting direction d for the algorithm should be set to 5 for the counterclockwise traversal.
Let a ∈ {0, 1}^X be the source image and let c be the image variable used to store the chain code.
Combining the three phases outlined above we arrive at the following image algebra pseudocode for chain
code extraction from an 8-connected object contained in a:
n := a ⊕ t
x := x0
d := 0
d0 := f(d, n(x0))
i := 0
loop
d := f(d, n(x))
if d = d0 and x = x0 then
break
end if
c(i) := d
x := x + ´(d)
i := i + 1
end loop.
In order to reconstruct the object from its chain code, the algorithm below can be used to generate the object’s
interior 8-boundary.
a := 0
a(x0) := 1
x := x0 + ´(c(0))
for i in 1..n - 1 loop
a(x) := 1
x := x + ´(c(i))
end loop.
The region bounded by the 8-boundary can then be filled using one of the hole filling algorithms of Section
6.6.
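A small Python version of this reconstruction step is sketched below; the δ table is the one listed earlier, and the starting point x0 and the chain code list are assumed to come from the extraction phase.
import numpy as np
# direction increments delta(d) in (row, column) form, as listed above
DELTA = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1),
         4: (0, -1), 5: (1, -1), 6: (1, 0), 7: (1, 1)}
def boundary_from_chain_code(shape, x0, code):
    # shape: (rows, cols) of the output image; x0: starting point; code: list of directions
    a = np.zeros(shape, dtype=np.uint8)
    a[x0] = 1
    x = (x0[0] + DELTA[code[0]][0], x0[1] + DELTA[code[0]][1])
    for d in code[1:]:
        a[x] = 1
        x = (x[0] + DELTA[d][0], x[1] + DELTA[d][1])
    return a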
A region, for the purposes of this section, is a subset of the domain of a whose points are all mapped to the same (or similar) pixel value by a. Regions will be indexed by the pixel value of their members, that is, region Ri is the set {x ∈ X : a(x) = i}.
Adjacency of regions is an intuitive notion, which is formalized by the adj relation.
Here d denotes the Euclidean distance between x = (x1,x2) and y = (y1,y2); i.e.,
. In other words, Ri and Rj are adjacent if and only if there is a point in
Ri and a point in Rj that are a distance of one unit apart.
The adjacency relation, denoted by adj, can be represented by a set of ordered pairs, an image, or a graph. The
ordered pair representation of the adj relation is the set
The adj relation is symmetric (Ri adj Rj ⇒ Rj adj Ri). Hence, the above set contains redundant information regarding the relation. This redundancy is eliminated by using the set {(i,j) : i > j and Ri adj Rj} to represent the adjacency relation.
The image b ∈ {0,1}^Y defined by
where Y = {(i,j) : 2 ≤ i ≤ ∨a, 1 ≤ j < i} is the image representation of the adjacency relation among the regions in X.
Region numbers form the vertices of the graph representation of the adj relation; edges are the pairs (i,j) where Ri adj Rj. Thus, there are three easy ways to represent the notion of region adjacency. As an example, we provide the three representations for the adjacency relation defined over the regions in Figure 10.5.1.
Let a be the input image, where X is an m × n array. In order for the adjacency relation to be meaningful, the image a in most applications will be the result of preprocessing an original image with some type of segmentation technique; e.g., component labeling (Sections 6.2 and 6.3).
The adjacency relation for regions in X will be represented by the image b ∈ {0,1}^Y, where Y = {(i,j) : 2 ≤ i ≤ ∨a, 1 ≤ j < i}. The parameterized template used to determine adjacency is defined by
For a fixed point (x0,y0) of X, is an image in {0,1}^Y. If regions Ri and Rj “touch” at the point
is equal to 1 if Ri and Rj are adjacent and 0 otherwise. The ordered pair representation of the adjacency
relation is the set
domain(b||=1).
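Outside the image algebra setting, the ordered pair representation can also be collected by scanning horizontal and vertical neighbor pairs of a label image, as in the following Python/NumPy sketch.
import numpy as np
def adjacency_pairs(a):
    # a: 2-D array of region labels; returns {(i, j) : i > j and Ri adj Rj}
    pairs = set()
    firsts = np.concatenate([a[:, :-1].ravel(), a[:-1, :].ravel()])   # left / upper member of each pair
    seconds = np.concatenate([a[:, 1:].ravel(), a[1:, :].ravel()])    # right / lower member of each pair
    for p, q in zip(firsts, seconds):
        if p != q:
            pairs.add((int(max(p, q)), int(min(p, q))))               # larger region index first
    return pairs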
thus in the graph above each region is assumed to have an edge that points back to itself.
(c) Image — The inclusion is represented by the image a defined by
The following is the image representation of the inclusion relation for Figure 10.6.1.
i\j 1 2 3 4 5 6
1 1 0 0 0 0 0
2 1 1 0 0 0 0
3 1 0 1 0 0 0
4 1 1 0 1 0 0
5 1 0 1 0 1 0
6 1 1 0 0 0 1
Note that regions are sets; however, the inside of relation and the set inclusion ⊆ relation are distinct concepts. Two algorithms used to describe the inclusion relation among regions in an image are presented next.
The algorithm above requires applications of the hole filling algorithm. Each application of the hole
filling algorithm is, in itself, computationally very expensive. The alternate image algebra formulation
assumes that regions at the same level of the inside of graph are not adjacent. This assumption will allow for
greater computational efficiency at the cost of loss of generality.
The alternate image algebra formulation makes its determination of inclusion based on pixel value transitions
along a row starting from a point (x0,y0) inside a region and moving in a direction toward the image’s
boundary. More specifically, if the image a is defined over an m × n grid, the number of transitions from
region Ri to region Rj starting at (x0,y0) moving along a row toward the right side of the image is given by
Let , where be the source image. Let I be the ordered pair representation of
Let be the reflection f(x,y) = (y,x). The alternate image algebra formulation used to generate
the ordered pair representation of the inclusion relation for regions in a is
Here denotes the composition of h and f, and . A nice
property of this algorithm is that one execution of the for loop finds all the regions that contain the region
whose index is the current value of the loop variable. The first algorithm presented required nested for loops.
Note that for each point is an image whose value at (i,j) indicates whether the line crosses the boundary from Ri to Rj at (x0, y0 + k - 1). Thus,
To obtain the set of quadtree codes for a 2^n × 2^n binary image, we assign the individual black pixels of the image their corresponding quadtree codes, and work upwards to merge 4 codes representing 4 black nodes at a lower level into a code representing the black node consisting of those 4 black nodes if possible. The merging process can be performed as follows: whenever 4 black nodes at the (k + 1)th level are encoded as , we merge those codes into which represents a kth-level black node consisting of the 4 black nodes at the (k + 1)th level. This scheme provides an algorithm that transforms a 2^n × 2^n binary image into its quadtree representation.
Let the source image be an image on X = {(i,j) : 0 ≤ i,j ≤ 2^n - 1}. Define an image by
where
and
f1(y1,y2) = (2y1,2y2)
and
Now, if Q denotes the set of quadtree codes to be constructed, then Q can be obtained by using the following
algorithm.
In the above algorithm, which first appeared in Shi [7], ak is a 2^k × 2^k image in which any positive pixel encodes a kth-level black node which may be further merged to a bigger node at the (k - 1)th level. The image bk is a 2^(k-1) × 2^(k-1) image in which any positive pixel encodes a black node merged from the corresponding four black nodes at the kth level. The image ck is a 2^k × 2^k binary image in which any pixel with value 1 indicates that the corresponding node cannot be further merged. Thus, a positive pixel of ak · ck encodes a kth-level black node which cannot be further merged.
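A recursive Python sketch of the same idea is given below; it is written top-down for brevity but produces the same maximal black nodes as the bottom-up merging described above. The (level, row, col) triples used as codes here are an illustrative simplification of the code format of [7, 8].
import numpy as np
def quadtree_codes(a, level=0, row=0, col=0):
    # a: 2^n x 2^n array of 0s and 1s; returns a list of (level, row, col) black-node codes
    if not a.any():
        return []                               # all white: nothing to encode
    if a.all():
        return [(level, row, col)]              # all black: one node covers the whole block
    h = a.shape[0] // 2
    codes = []
    codes += quadtree_codes(a[:h, :h], level + 1, 2 * row, 2 * col)
    codes += quadtree_codes(a[:h, h:], level + 1, 2 * row, 2 * col + 1)
    codes += quadtree_codes(a[h:, :h], level + 1, 2 * row + 1, 2 * col)
    codes += quadtree_codes(a[h:, h:], level + 1, 2 * row + 1, 2 * col + 1)
    return codes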
Let a be an image that represents an object consisting of positive-valued pixels that is set against a background of 0-valued pixels. Position refers to the location of the object in the plane. The object's centroid (or center of mass) is the point that is used to specify its position. The centroid is the point whose coordinates are given by
Figure 10.8.1 shows the location of the centroid for an SR71 object.
Orientation refers to how the object lies in the plane. The object's moment of inertia is used to determine its angle of orientation. The moment of inertia about the line with slope tan θ passing through the centroid is defined as
The angle θmin that minimizes Mθ is the direction of the axis of least inertia for the object. If θmin is unique, then the line in the direction θmin is the principal axis of inertia for the object. For a binary image the principal axis of inertia is the axis about which the object appears elongated. The principal axis of inertia in the direction θ is shown for the SR71 object in Figure 10.8.1. The line in the direction θmax is the axis about which the moment of inertia is greatest. In a binary image, the object is widest about this axis. The angles θmin and θmax are separated by 90°.
Let
Figure 10.8.1 Centroid and angle θ of orientation.
The moment of inertia about the line in the direction θ can be written as
Mθ = m20 sin²θ - 2 m11 sinθ cosθ + m02 cos²θ.
for critical values of θ. Using the identity the search for critical values leads to the quadratic equation
Solving the quadratic equation leads to two solutions for tan θ, and hence two angles θmin and θmax. These are the angles of the axes about which the object has minimum and maximum moments of inertia, respectively. Determining which solution of the quadratic equation minimizes (or maximizes) the moment of inertia requires substitution back into the equation for Mθ.
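In NumPy the centroid and the two critical angles can be computed from the second-order central moments; the sketch below uses the closed form θmin = (1/2) arctan2(2 m11, m20 - m02), which is equivalent to solving the quadratic equation mentioned above.
import numpy as np
def centroid_and_orientation(a):
    # a: 2-D array; object pixels have positive values, background pixels are 0
    ys, xs = np.nonzero(a)
    w = a[ys, xs].astype(float)
    xbar = (w * xs).sum() / w.sum()
    ybar = (w * ys).sum() / w.sum()
    m20 = (w * (xs - xbar) ** 2).sum()                 # central moments of order 2
    m02 = (w * (ys - ybar) ** 2).sum()
    m11 = (w * (xs - xbar) * (ys - ybar)).sum()
    theta_min = 0.5 * np.arctan2(2 * m11, m20 - m02)   # minimizes M_theta as written above
    theta_max = theta_min + np.pi / 2                  # the perpendicular critical angle
    return (xbar, ybar), theta_min, theta_max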
For the digital image the central moments of order 2 are given by
Symmetry is the ratio, , of the minimum moment of inertia to the maximum moment of
inertia. For binary images symmetry is a rough measure of how “elongated” an object is. For example, a circle
has a symmetry ratio equal to 1; a straight line has a symmetry ratio equal to 0.
The image algebra formulations for the coordinates of the centroid are
The image algebra formulas for the central moments of order 2 used to compute the angle of orientation are
given by
Let f be a continuous function defined over ℝ². The moment of order (p,q) of f is defined by
where p,q ∈ {0,1,2,…}. It has been shown [13] that if f is a piecewise continuous function with bounded support, then moments of all orders exist. Furthermore, under the same conditions on f, mpq is uniquely determined by f, and f is uniquely determined by mpq.
where
The point is called the image centroid. The center of gravity of an object is the physical analogue of
the image centroid.
Table 10.9.1 lists moment invariants for transformations applied to the binary image of the character “A.” There is some discrepancy within the φi due to digitization.
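As an illustration, the first two of Hu's invariants [11] can be computed from normalized central moments as in the Python sketch below; the remaining invariants follow the same pattern and are omitted for brevity.
import numpy as np
def first_two_invariants(a):
    # a: 2-D gray-value or binary image of the object
    ys, xs = np.nonzero(a)
    w = a[ys, xs].astype(float)
    m00 = w.sum()
    xbar, ybar = (w * xs).sum() / m00, (w * ys).sum() / m00
    def mu(p, q):                                # central moment of order (p, q)
        return (w * (xs - xbar) ** p * (ys - ybar) ** q).sum()
    def eta(p, q):                               # normalized central moment
        return mu(p, q) / m00 ** (1 + (p + q) / 2.0)
    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2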
10.10. Histogram
The histogram of an image is a function that provides the frequency of occurrence for each intensity level in
the image. Image segmentation schemes often incorporate histogram information into their strategies. The
histogram can also serve as a basis for measuring certain textural properties, and is the major statistical tool
for normalizing and requantizing an image [12, 14].
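For a gray-value image with G levels, the histogram and the normalized cumulative histogram used below can be computed directly; a short NumPy sketch follows, with the image a and the level count G as assumed inputs.
import numpy as np
def histogram(a, G):
    # a: 2-D integer image with gray values in 0..G-1
    h = np.bincount(a.ravel(), minlength=G)      # h[j] = number of pixels having value j
    cum = np.cumsum(h) / a.size                  # normalized cumulative histogram
    return h, cum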
since represents the number of pixels having value j, i.e., since and
χj(a(x)) = 1 ⇔ a(x) = j.
The normalized cumulative histogram is analogous to the cumulative probability function of statistics.
Figure 10.11.1 shows an image of a jet superimposed over a graphical representation of its cumulative
histogram.
For a given image, where G denotes the number of expected gray levels, all second-order statistical measures are completely specified by the joint probability
s(Δx,Δy : i,j) = probability{a(x,y) = i and a(x + Δx, y + Δy) = j},
where . Thus, s(Δx,Δy : i,j) is the probability that an arbitrary pixel location (x,y) has gray level i, while pixel location (x + Δx, y + Δy) has gray level j.
For illustration purposes, let X be a 4 × 4 grid and a be the image represented in Figure 10.12.1 below. In this figure, we assume the usual matrix ordering of a with a(0,0) in the upper left-hand corner.
For purposes of texture classification, many researchers do not distinguish between s(d,θ : i,j) and s(d,θ : j,i); that is, they do not distinguish which one of the two pixels that are (Δx, Δy) apart has gray value i and which one has gray value j. Therefore comparison between two textures is often based on the texture co-occurrence matrix
c(d,θ) = s(d,θ) + s′(d,θ),
where s′ denotes the transpose of s. Using the SGLD matrices of the above example yields the following texture co-occurrence matrices associated with the image shown in Figure 10.12.1:
Since co-occurrence matrices have the property that c(d,θ) = c(d,θ + π) (which can be ascertained from their symmetry), the rationale for choosing only angles between 0 and π becomes apparent.
Co-occurrence matrices are used in defining the following more complex texture statistics. The marginal probability matrices cx(d,θ : i) and cy(d,θ : j) of c(d,θ) are defined by
and
Five commonly used features for texture classification that we will reformulate in terms of image algebra are defined below.
(a) Energy:
(b) Entropy:
(c) Correlation:
(e) Inertia:
Thus, for example, choosing Δx = 1 and Δy = 0 computes the image s(1, 0°), while choosing Δx = 1 and Δy = 1 computes s(1, 45°).
The co-occurrence image c is given by the statement
c(Δx,Δy) := s(Δx,Δy) + s′(Δx,Δy).
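A straightforward Python/NumPy sketch of the SGLD matrix s(Δx, Δy) and the co-occurrence matrix c = s + s′ is given below; the image a and the number of gray levels G are assumed inputs.
import numpy as np
def sgld(a, G, dx, dy):
    # a: 2-D integer image with gray values in 0..G-1; (dx, dy) is the displacement
    rows, cols = a.shape
    s = np.zeros((G, G), dtype=int)
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                s[a[y, x], a[yy, xx]] += 1       # count the (i, j) pair at this displacement
    c = s + s.T                                  # co-occurrence matrix c = s + s'
    return s, c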
The co-occurrence image can also be computed directly by using the parameterized template
defined by
If i = j, then we obtain which corresponds to the doubling effect along the diagonal when adding the matrices s + s′. Although the template method of computing c(Δx,Δy) can be stated as a simple one-line formula, actual computation using this method is very inefficient since the template is translation variant.
The marginal probability images cx(Δx,Δy) and cy(Δx,Δy) can be computed as follows:
The actual computation of these sums will probably be achieved by using the following loops:
and
Another version for computing cx(Δx,Δy) and cy(Δx,Δy), which does not involve loops, is obtained by defining the neighborhood functions and by
Nx(i) = {(i,j) : j = 0, 1, … , G - 1}
and
Ny(j) = {(i,j) : i = 0, 1, … , G - 1},
respectively. The marginal probability images can then be computed using the following statements:
This method of computing cx(Δx,Δy) and cy(Δx,Δy) is preferable when using special mesh-connected architectures.
(b) Entropy:
(c) Correlation:
(e) Inertia:
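The usual definitions of these features, following Haralick [16, 17], can be computed from a normalized co-occurrence matrix as in the Python sketch below; the normalizations used in the image algebra formulations may differ in detail.
import numpy as np
def texture_features(c):
    # c: co-occurrence matrix; it is normalized here so that its entries sum to 1
    p = c / c.sum()
    i, j = np.indices(p.shape)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log(p[p > 0])).sum()
    inertia = ((i - j) ** 2 * p).sum()
    mu_x, mu_y = (i * p).sum(), (j * p).sum()
    sd_x = np.sqrt((((i - mu_x) ** 2) * p).sum())
    sd_y = np.sqrt((((j - mu_y) ** 2) * p).sum())
    correlation = (((i - mu_x) * (j - mu_y) * p).sum()) / (sd_x * sd_y)
    return energy, entropy, correlation, inertia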
Suppose c(1,0°) is the spatial dependence matrix (at distance 1, direction 0°) for an image defined over an m ×
n grid. There are 2n(m - 1) nearest horizontal neighborhood pairs in the m × n grid. Thus, for this example
10.13. References
1 W. Pratt, Digital Image Processing. New York: John Wiley, 1978.
2 G. Ritter, “Image algebra with applications.” Unpublished manuscript, available via anonymous ftp
from ftp://ftp.cis.ufl.edu/pub/src/ia/documents, 1994.
3 A. Rosenfeld and A. Kak, Digital Picture Processing. New York, NY: Academic Press, 2nd ed.,
1982.
4 H. Freeman, “On the encoding of arbitrary geometric configurations,” IRE Transactions on
Electronic Computers, vol. 10, 1961.
5 H. Freeman, “Computer processing of line-drawing images,” Computing Surveys, vol. 6, pp. 1-41,
Mar. 1974.
6 G. Ritter, “Recent developments in image algebra,” in Advances in Electronics and Electron Physics
(P. Hawkes, ed.), vol. 80, pp. 243-308, New York, NY: Academic Press, 1991.
7 H. Shi and G. Ritter, “Structural description of images using image algebra,” in Image Algebra and
Morphological Image Processing III, vol. 1769 of Proceedings of SPIE, (San Diego, CA), pp. 368-375,
July 1992.
8 I. Gargantini, “An effective way to represent quadtrees,” Communications of the ACM, vol. 25, pp.
905-910, Dec. 1982.
9 A. Rosenfeld and A. Kak, Digital Image Processing. New York: Academic Press, 1976.
10 J. Russ, Computer-Assisted Microscopy: The Measurement and Analysis of Images. New York,
NY: Plenum Press, 1990.
11 M. Hu, “Visual pattern recognition by moment invariants,” IRE Transactions on Information
Theory, vol. IT-8, pp. 179-187, 1962.
12 R. Gonzalez and P. Wintz, Digital Image Processing. Reading, MA: Addison-Wesley, second ed.,
1987.
13 A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw Hill,
1965.
14 E. Hall, “A survey of preprocessing and feature extraction techniques for radiographic images,”
IEEE Transactions on Computers, vol. C-20, no. 9, 1971.
15 D. Ballard and C. Brown, Computer Vision. Englewood Cliffs, NJ: Prentice-Hall, 1982.
16 R. Haralick, K. Shanmugam, and I. Dinstein, “Textural features for image classification,” IEEE
Transactions on Systems, Man, and Cybernetics, vol. SMC-3, pp. 610-621, Nov. 1973.
17 R. Haralick, “Statistical and structural approaches to textures,” Proceedings of the IEEE, vol. 67,
pp. 786-804, May 1979.
18 J. Weszka, C. Dyer, and A. Rosenfeld, “A comparative study of texture measures for terrain
classification,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6, pp. 269-285, Apr.
1976.
Chapter 11
Neural Networks and Cellular Automata
11.1. Introduction
Artificial neural networks (ANNs) and cellular automata have been successfully employed to solve a variety
of computer vision problems [1, 2]. One goal of this chapter is to demonstrate that image algebra provides an
ideal environment for precisely expressing current popular neural network models and their computations.
Artificial neural networks (ANNs) are systems of dense interconnected simple processing elements. There
exist many different types of ANNs designed to address a wide range of problems in the primary areas of
pattern recognition, signal processing, and robotics. The function of different types of ANNs is determined
primarily by the processing elements’ pattern of connectivity, the strengths (weights) of the connecting links,
the processing elements’ characteristics, and training or learning rules. These rules specify an initial set of
weights and indicate how weights should be adapted during use to improve performance.
The theory and representation of the various ANNs is motivated by the functionality and representation of
biological neural networks. For this reason, processing elements are usually referred to as neurons while
interconnections are called axons and/or synaptic connections. Although representations and models may
differ, all have the following basic components in common:
(a) A finite set of neurons a(1), a(2),…, a(n) with each neuron a(i) having a specific neural value at
time t which we will denote by at(i).
(b) A finite set of axons or neural connections W = (wij), where wij denotes the strength of the
connection of neuron a(i) with neuron a(j).
where θ is a threshold and f a hard limiter, threshold logic, or sigmoidal function which introduces a nonlinearity into the network.
It is worthwhile noting that image algebra has suggested a more general concept of neural computation than
that given by the classical theory [3, 4, 5, 6].
Cellular automata and artificial neural networks share a common framework in that the new or next stage of a
neuron or cell depends on the states of other neurons or cells. However, there are major conceptual and
physical differences between artificial neural networks and cellular automata. Specifically, an n-dimensional
cellular automaton is a discrete set of cells (points, or sites) in . At any given time a cell is in one
of a finite number of states. The arrangement of cells in the automaton form a regular array, e.g., a square or
hexagonal grid.
As in ANNs, time is measured in discrete steps. The next state of cell is determined by a spatially and
temporally local rule. However, the new state of a cell depends only on the current and previous states of its
neighbors. Also, the new state depends on the states of its neighbors only for a fixed number of steps back in
time. The same update rule is applied to every cell of the automaton in synchrony.
Although the rules that govern the iteration locally among cells are very simple, the automaton as a whole can demonstrate very fascinating and complex behavior. Cellular automata are being studied as modeling tools in
a wide range of scientific fields. As a discrete analogue to modeling with partial differential equations,
cellular automata can be used to represent and study the behavior of natural dynamic systems. Cellular
automata are also used as models of information processing.
In this chapter we present an example of a cellular automaton as well as an application of solving a problem
using cellular automata. As it turns out, image algebra is well suited for representing cellular automata. The
states of the automata (mapped to integers, if necessary) can be stored in an image variable
As mentioned in the introduction, neural networks have four common components. The specifics for the
Hopfield net presented here are outlined below.
1. Neurons
The Hopfield neural network has a finite set of neurons a(i), 1 ≤ i ≤ N, which serve as processing units. Each neuron has a value (or state) at time t denoted by at(i). A neuron in the Hopfield net can be in one of two states, either -1 or +1; i.e., at(i) ∈ {-1, +1}.
2. Synaptic Connections
The permanent memory of a neural net resides within the interconnections between its neurons. For
each pair of neurons, a(i) and a(j), there is a number wij called the strength of the synapse (or
connection) between a(i) and a(j). The design specifications for this version of the Hopfield net require
that wij = wji and wii = 0 (see Figure 11.2.1).
Figure 11.2.1 Synaptic connections for nodes a(i) and a(j) of the Hopfield neural network.
3. Propagation Rule
The propagation rule (Figure 11.2.2) defines how states and synaptic strengths combine as input to a
neuron. The propagation rule τt(i) for the Hopfield net is defined by
4. Activation Function
The activation function f determines the next state of the neuron at+1(i) based on the value Ät(i)
calculated by the propagation rule and the current neuron value at(i) (see Figure 11.2.2). The activation
function for the Hopfield net is the hard limiter defined below.
Figure 11.2.2 Propagation rule and activation function for the Hopfield network.
The patterns used for this version of the Hopfield net are N-dimensional vectors from the space P = {-1, 1}^N. Let denote the kth exemplar pattern, where 1 ≤ k ≤ K. The dimensionality of the
pattern space determines the number of nodes in the net. In this case the net will have N nodes a(1), a(2), …,
a(N). The Hopfield net algorithm proceeds as outlined below.
Step 1. Assign weights to synaptic connections.
This is the step in which the exemplar patterns are imprinted onto the permanent memory of the net.
Assign weights wij to the synaptic connections as follows:
Note that wij = wji, therefore it is only necessary to perform the computation above for i < j.
Step 2. Initialize the net with the unknown pattern.
At this step the pattern that is to be classified is introduced to the net. If p = (p1, p2, …, pN) is the
unknown pattern, set
Continue this process until further iteration produces no state change at any node. At convergence, the
N-dimensional vector formed by the node states is the exemplar pattern that the net has associated with the
input pattern.
Step 4. Continue the classification process.
To classify another pattern, repeat Steps 2 and 3.
Example
As an example, consider a communications system designed to transmit the six 12 × 10 binary images 1, 2, 3,
4, 9, and X (see Figure 11.2.3). Communications channels are subject to noise, and so an image may become
garbled when it is transmitted. A Hopfield net can be used for error correction by matching the corrupted
image with the exemplar pattern that it most resembles.
A binary image b ∈ {0,1}^X, where , can be translated into a pattern p = (p1, p2, …, p120), and
vice versa, using the relations
and
The six exemplar images are translated into exemplar patterns. A 120-node Hopfield net is then created by
assigning connection weights using these exemplar patterns as outlined earlier.
The corrupted image is translated into its pattern representation p = (p1, p2, …, pN) and introduced to the Hopfield net (a0(i) is set equal to pi, 1 ≤ i ≤ N). The input pattern evolves through neuron state changes into the pattern of the neuron states at convergence . If the net converges to an exemplar pattern, it is assumed that the exemplar pattern represents the true (uncorrupted) image that was transmitted. Figure 11.2.4 pictorially summarizes the use of a Hopfield net for the binary pattern classification process.
Let a be the image used to represent neuron states. Initialize a with the unknown image
pattern. The weights of the synaptic connections are represented by the template which is
defined by
where is the ith element of the exemplar for class k. Let f be the hard limiter function defined earlier. The
image algebra formulation for the Hopfield net algorithm is as follows:
Asynchronous updating of neural states is required to guarantee convergence of the net. The formulation above does not update neural states asynchronously; however, it allows more parallelism and hence increased processing speed. If the convergence properties of the parallel implementation are acceptable, the parallelism can be used to speed up processing.
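A conventional (non-image-algebra) Python sketch of the Hopfield procedure is given below; it builds the weight matrix from the exemplar patterns and then updates all neuron states in parallel, subject to the caveat about asynchronous updating noted above.
import numpy as np
def hopfield_weights(exemplars):
    # exemplars: K x N array of -1/+1 exemplar patterns
    W = exemplars.T @ exemplars          # w_ij = sum over k of p_i(k) * p_j(k)
    np.fill_diagonal(W, 0)               # w_ii = 0 by design
    return W
def hopfield_recall(W, p, max_iter=100):
    a = p.copy()                         # initialize the net with the unknown pattern
    for _ in range(max_iter):
        tau = W @ a                      # propagation rule
        new = np.where(tau > 0, 1, np.where(tau < 0, -1, a))   # hard limiter; ties keep the old state
        if np.array_equal(new, a):
            break                        # no state change at any node: convergence
        a = new
    return a
With synchronous updates of this kind the state sequence may settle into a short cycle rather than a fixed point; updating one randomly chosen neuron at a time restores the convergence guarantee mentioned above.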
The Hopfield net restricted to its set of exemplars can be represented by the set of ordered pairs
A properly designed Hopfield net should also take an input pattern from that is not an exemplar pattern
to the exemplar that best matches the input pattern.
A bidirectional associative memory (BAM) is a generalization of the Hopfield net [8, 9]. The domain and the
range of the BAM transformation need not be of the same dimension. A set of associations
is imprinted onto the memory of the BAM so that
That is, for 1 ≤ k ≤ K. For an input pattern a that is not an element of {ak : 1 ≤ k ≤ K} the BAM should converge to the association pair (ak, bk) for the ak that best matches a.
The components of the BAM and how they interact will be discussed next. The bidirectional nature of the
BAM will be seen from the description of the algorithm and the example provided.
The three major components of a BAM are given below.
1. Neurons
Unlike the Hopfield net, the domain and the range of the BAM need not have the same dimensionality.
Therefore, it is necessary to have two sets of neurons Sa and Sb to serve as memory locations for the
input and output of the BAM
The state values of a(m) and b(n) at time t are denoted at(m) and bt(n), respectively. To guarantee
convergence, neuron state values will be either 1 or -1 for the BAM under consideration.
2. Synaptic Connection Matrix
Note the similarity in the way weights are assigned for the BAM and the Hopfield net.
3. Activation Function
The activation function f used for the BAM under consideration is the hard limiter defined by
The algorithm for the BAM is as outlined in the following four steps.
Step 1. Create the weight matrix w.
The first step in the BAM algorithm is to generate the weight matrix w using the formula
The neurons in Sb should be assigned values randomly from the set {-1, 1}, i.e.,
then calculate the next state values for the neurons in Sa using
The alternation between the sets Sb and Sa used for updating neuron values is why this type of
associative memory neural net is referred to as “bidirectional.” The forward feed (update of Sb)
followed by the feedback (update of Sa) improves the recall accuracy of the net.
Continue updating the neurons in Sb followed by those in Sa until further iteration produces no state
change for any neuron. At time of convergence t = c, the association that the BAM has recalled is (a,
b), where a = (ac(1), ac(2), …, ac(M)) and b = (bc(1), bc(2), …, bc(N)).
Step 4. Continue classification.
To classify another pattern repeat Steps 2 and 3.
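A minimal Python sketch of these steps, assuming -1/+1 patterns of lengths M and N, is shown below; ties in the hard limiter keep the previous state, as in the Hopfield sketch given earlier.
import numpy as np
def bam_weights(A, B):
    # A: K x M array of patterns a_k; B: K x N array of associated patterns b_k (entries -1/+1)
    return A.T @ B                       # w_mn = sum over k of a_k(m) * b_k(n)
def bam_recall(W, a, max_iter=100):
    b = np.where(np.random.rand(W.shape[1]) < 0.5, -1, 1)        # random initial Sb states
    for _ in range(max_iter):
        fwd = a @ W
        b_new = np.where(fwd > 0, 1, np.where(fwd < 0, -1, b))   # update Sb (forward feed)
        back = W @ b_new
        a_new = np.where(back > 0, 1, np.where(back < 0, -1, a)) # update Sa (feedback)
        if np.array_equal(a_new, a) and np.array_equal(b_new, b):
            break                        # no state change for any neuron: convergence
        a, b = a_new, b_new
    return a, b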
Example
Figure 11.3.1 shows the evolution of neuron states in a BAM from initialization to convergence. Three
lowercase-uppercase character image associations (a, A), (b, B), and (c, C) were used to create the weight
matrix w. The lowercase characters are 12 × 12 images and the uppercase characters are 16 × 16 images. The
conversion from image to pattern and vice versa is done as in the example of Section 11.2. A corrupted “a” is
input onto the Sa neurons of the net.
Let a and b be the image variables used to represent the state for neurons in the sets Sa and Sb, respectively. Initialize a with the unknown pattern. The neuron values of Sb are
where is the mth component of ak and is the nth component of bk in the association pair (ak, bk). The
activation function is denoted f.
The image algebra pseudocode for the BAM is given by
which is equivalent to
Therefore, imprinting the association pair (a, b) onto the memory of a BAM also imprints (aC, bC) onto the
memory. The effect of this is seen in Block 5 of Figure 11.3.2.
or
The Hamming net [7, 10] partitions a pattern space into classes Cn, 1 ≤ n ≤ N. Each class is represented by an exemplar pattern en ∈ Cn. The Hamming net takes as input a pattern and assigns it to class Ck if and only if
That is, the input pattern is assigned to the class whose exemplar pattern is closest to it as measured by
Hamming distance. The Hamming net algorithm is presented next.
There is a neuron a(n) in the Hamming net for each class Cn, 1 ≤ n ≤ N (see Figure 11.4.1). The weight tij assigned to the connection between neurons a(i) and a(j) is given by
where 0 < ε < 1/N and 1 ≤ i, j ≤ N. Assigning these weights is the first step in the Hamming algorithm.
When the input pattern p = (p1, p2, …, pM) is presented to the net, the neuron value a0(n) is set equal to the number of component matches between p and the exemplar en. If is the nth exemplar, a0(n) is set using the formula
where
The next state of a neuron in the Hamming net is given by
The next state of a neuron is calculated by first decreasing its current value by an amount proportional to the sum of the current values of all the other neuron values in the net. If the reduced value falls below zero then the new neuron value is set to 0; otherwise it assumes the reduced value. Eventually, the process of updating neuron values will lead to a state in which only one neuron has a value greater than zero. At that point the neuron with the nonzero value, say ac(k) ≠ 0, represents the class Ck that the net assigns to the input pattern. At time 0, the value of a(k) was greater than all the other neuron values. Therefore, k is the number of the class whose exemplar had the most matches with the input pattern.
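A Python sketch of this procedure, using the weight convention tij = 1 for i = j and -ε otherwise with 0 < ε < 1/N, might look as follows; the initialization by match counts is the one described above.
import numpy as np
def hamming_net(exemplars, p, eps=None, max_iter=1000):
    # exemplars: N x M array of -1/+1 class exemplars; p: input pattern of length M
    N, M = exemplars.shape
    if eps is None:
        eps = 1.0 / (2 * N)                          # any value with 0 < eps < 1/N will do
    a = (M + exemplars @ p) / 2.0                    # initial states: matches with each exemplar
    for _ in range(max_iter):
        a = np.maximum(0.0, a - eps * (a.sum() - a)) # subtract eps times the other values, clip at 0
        if np.count_nonzero(a) <= 1:
            break                                    # only one neuron remains positive
    return int(np.argmax(a))                         # number of the winning class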
Figure 11.4.2 shows a six-neuron Hamming net whose classes are represented by the exemplars 1, 2, 3, 4, 9,
and X. The Hamming distance between the input pattern p and each of the exemplars is displayed at the
bottom of the figure. The number of matches between exemplar en and p is given by the neuron states a0(n), 1
d n d 6. At time of completion, only one neuron has a positive value, which is ac(5) = 37.5. The Hamming net
example assigns the unknown pattern to Class 5. That is, the input pattern had the most matches with the “9”
exemplar which represents Class 5.
Let be the image variable used to store neural states. The activation function f is as defined
where is the nth exemplar pattern, is used to initialize the net. The template
defined by
Given the input pattern the image algebra formulation for the Hamming net is as follows:
The variable c contains the number of the class that the net assigns to the input pattern.
A single-layer perceptron is used to classify a pattern p = (p1, p2, …, pm) into one of two
classes. An SLP consists of a set of weights,
and then applies the limiter function. If f(g(p)) < 0, the perceptron assigns p to class C0. The pattern p is assigned to class C1 if f(g(p)) > 0. The SLP is represented in Figure 11.5.1.
The graph of 0 = g(x) = w0 + w1x1 + ··· + wmxm is a hyperplane that divides the pattern space into two regions. This
hyperplane is called the perceptron’s decision surface. Geometrically, the perceptron classifies a pattern based
on which side the pattern (point) lies on. Patterns for which g(p) < 0 lie on one side of the decision surface
and are assigned to C0. Patterns for which g(p) > 0 lie on the opposite side of the decision surface and are
assigned to C1.
The perceptron operates in two modes — a classifying mode and a learning mode. The pattern classification
mode is as described above. Before the perceptron can function as a classifier, the values of its weights must
be determined. The learning mode of the perceptron is concerned with the assignment of weights (or the
determination of a decision surface).
If the application allows, i.e., if the decision surface is known a priori, weights can be assigned analytically.
However, key to the concept of a perceptron is the ability to determine its own decision surface through a
learning algorithm [7, 11, 12]. There are several algorithms for SLP learning. The one we present here can be
found in Rosenblatt [11].
Step 2. Present pk to the SLP. Let denote the computed class number of pk, i.e.,
Step 4. Repeat Steps 2 and 3 for each element of the training set, recycling if necessary, until
convergence or a predefined number of iterations.
The learning-rate constant η, 0 < η ≤ 1, regulates the rate of weight adjustment. A small η results in slow learning. However, if η
is too large the learning process may not be able to home in on a good set of weights.
Note that if the SLP classifies pk correctly, then and there is no adjustment made to the
weights. If , then the change in the weight vector is proportional to the input pattern.
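The classification rule and the weight-update step just described can be sketched in plain C++ as follows. This is not the iac++ formulation; the identifiers are illustrative, and eta plays the role of the learning-rate constant described above.
#include <cstddef>
#include <vector>
// g(p) = w0 + w1*p1 + ... + wm*pm, with w[0] holding the bias w0.
double g(const std::vector<double> &w, const std::vector<double> &p)
{
    double s = w[0];
    for (std::size_t i = 0; i < p.size(); ++i) s += w[i + 1] * p[i];
    return s;
}
// Hard limiter: class 1 if g(p) > 0, class 0 otherwise.
int classify(const std::vector<double> &w, const std::vector<double> &p)
{
    return g(w, p) > 0.0 ? 1 : 0;
}
// One pass through the training set (0 < eta <= 1).  A correctly
// classified pattern leaves the weights unchanged; otherwise the weight
// vector is moved by an amount proportional to the input pattern.
void train_epoch(std::vector<double> &w,
                 const std::vector<std::vector<double> > &patterns,
                 const std::vector<int> &classes, double eta)
{
    for (std::size_t k = 0; k < patterns.size(); ++k) {
        const int error = classes[k] - classify(w, patterns[k]);
        if (error != 0) {
            w[0] += eta * error;                       // bias input is a constant 1
            for (std::size_t i = 0; i < patterns[k].size(); ++i)
                w[i + 1] += eta * error * patterns[k][i];
        }
    }
}
Repeating train_epoch until no pattern is misclassified reproduces the convergence behavior illustrated in Figure 11.5.2 whenever the two classes are linearly separable.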
Figure 11.5.2 illustrates the learning process for distinguishing two classes in the plane. Class 1 points have been
plotted with diamonds and Class 0 points have been plotted with crosses. The decision surface for this
example is a line. The lines plotted in the figure represent decision surfaces after n = 0, 20, 40, and 80 training
patterns have been presented to the SLP.
To train the SLP, let be a training set. Initially, each component of the weight image
variable w is set to a random real number. The image algebra pseudocode for the SLP learning algorithm
(iterating until convergence) is given by
Comments and Observations
If the set of input patterns cannot be divided into two linearly separable sets, then an SLP will fail as a classifier.
Consider the problem of designing an SLP for XOR classification. Such an SLP should be defined over the
pattern space {(0, 0), (0, 1), (1, 0), (1, 1)}. The classes for the XOR problem are
The points in the domain of the problem are plotted in Figure 11.5.3. Points in Class 0 are plotted with open
dots and points in Class 1 are plotted with solid dots. A decision surface in the plane is a line. There is no line
that separates classes 0 and 1 for the XOR problem. Thus, an SLP is incapable of functioning as a
classifier for the XOR problem.
By piping the output of SLPs through a multivariable AND gate, a two-layer perceptron can be designed for
any class C1 whose points lie in a region that can be constructed from the intersection of half-planes. Adding to the
perceptron a third layer consisting of a multivariable OR gate allows for the creation of even more complex
regions for pattern classification. The outputs from the AND layer are fed to the OR layer, which serves to
union the regions created in the AND layer. An OR gate is easily implemented using an SLP.
Figure 11.6.4 shows pictorially how the XOR problem can be solved with a three-layer perceptron. The AND
layer creates two quarter-planes from the half-planes created in the first layer. The third layer unions the two
quarter-planes to create the desired classification region.
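The construction of Figure 11.6.4 can be written out explicitly. In the plain C++ sketch below, the hand-picked weights are illustrative rather than taken from the text: the first layer of SLPs produces four half-planes, the AND layer intersects them into the two quarter-planes containing (0,1) and (1,0), and the OR layer unions the quarter-planes.
// A fixed-weight SLP: returns 1 when w0 + w1*x1 + w2*x2 > 0, else 0.
int slp(double w0, double w1, double w2, double x1, double x2)
{
    return (w0 + w1 * x1 + w2 * x2) > 0.0 ? 1 : 0;
}
// Three-layer perceptron for XOR (analytic design, weights assumed).
int xor_net(double x1, double x2)
{
    // Layer 1: four half-planes.
    const int left   = slp( 0.5, -1.0,  0.0, x1, x2);    // x1 < 0.5
    const int right  = slp(-0.5,  1.0,  0.0, x1, x2);    // x1 > 0.5
    const int bottom = slp( 0.5,  0.0, -1.0, x1, x2);    // x2 < 0.5
    const int top    = slp(-0.5,  0.0,  1.0, x1, x2);    // x2 > 0.5
    // Layer 2 (AND gates): intersect half-planes into two quarter-planes.
    const int q01 = slp(-1.5, 1.0, 1.0, left, top);       // contains (0,1)
    const int q10 = slp(-1.5, 1.0, 1.0, right, bottom);   // contains (1,0)
    // Layer 3 (OR gate): union of the two quarter-planes.
    return slp(-0.5, 1.0, 1.0, q01, q10);
}
Note that the AND and OR gates are themselves realized here as fixed-weight SLPs, in keeping with the remark above that an OR gate is easily implemented using an SLP.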
The example above is concerned with the analytic design of a multilayer perceptron. That is, it is assumed that
the classification regions in the pattern space are known a priori. Analytic design is not in keeping with the perceptron
concept. The distinguishing feature of a perceptron is its ability to determine the proper classification regions,
on its own, through a “learning by example” process. However, the preceding discussion does point out how
the applicability of perceptrons can be extended by combining single-layer perceptrons. Also, approaching the
problem analytically first provides insight into the design and operation of a “true”
perceptron. The design of a feedforward multilayer perceptron is presented next.
A feedforward perceptron consists of an input layer of nodes, one or more hidden layers of nodes, and an
output layer of nodes. We will focus on the two-layer perceptron of Figure 11.6.5. The algorithms for the
two-layer perceptron are easily generalized to perceptrons of three or more layers.
A node in a hidden layer is connected to every node in the layer above and below it. In Figure 11.6.5 weight
wij connects input node xi to hidden node hj and weight vjk connects hj to output node ok. Classification begins
by presenting a pattern to the input nodes xi, 1 ≤ i ≤ l. From there data flows in one direction (as indicated by
the arrows in Figure 11.6.5) through the perceptron until the output nodes ok, 1 ≤ k ≤ n, are reached. Output
nodes will have a value of either 0 or 1. Thus, the perceptron is capable of partitioning its pattern space into 2^n
classes.
The steps that govern data flow through the perceptron during classification are as follows:
Step 4. The class c = (c1, c2, …, cn) that the perceptron assigns to p must be a binary vector which, when
interpreted as a binary number, is the class number of p. Therefore, the ok must be thresholded at some
level τ appropriate for the application. The class that p is assigned to is then c = (c1, c2, …, cn), where
ck = χ≥τ(ok).
Step 5. Repeat Steps 1, 2, 3, and 4 for each pattern that is to be classified.
Above, when describing the classification mode of the perceptron, it was assumed that the values of the
weights between nodes had already been determined. Before the perceptron can serve as a classifier, it must
undergo a learning process in which its weights are adjusted to suit the application.
The learning mode of the perceptron requires a training set where is a pattern
and ct ∈ {0, 1}^n is a vector which represents the actual class number of pt. The perceptron learns (adjusts its
weights) using elements of the training set as examples. The learning algorithm presented here is known as
backpropagation learning.
For backpropagation learning, a forward pass and a backward pass are made through the perceptron. During
the forward pass a training pattern is presented to the perceptron and classified. The backward pass
recursively, level by level, determines error terms used to adjust the perceptron’s weights. The error terms at
the first level of the recursion are a function of ct and the output of the perceptron (o1, o2, …, on). After all the
error terms have been computed, weights are adjusted using the error terms that correspond to their level. The
backpropagation algorithm for the two-layer perceptron of Figure 11.6.5 is detailed in the steps which follow.
Step 1. Initialize the weights of the perceptron randomly with numbers between -0.1 and 0.1; i.e.,
Step 2. Present the pattern from the training pair (pt, ct) to the perceptron and
apply Steps 1, 2, and 3 of the perceptron’s classification algorithm outlined earlier. This completes the
forward pass of the backpropagation algorithm.
where represents the correct class of pt. The vector (o1, o2, …, on)
represents the output of the perceptron.
Step 7. Repeat Steps 2 through 6 for each element of the training set. One cycle through the training set
is called an epoch. The performance that results from the network’s training may be enhanced by
repeating epochs.
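As a concrete companion to Steps 1 through 7, the fragment below sketches one forward and backward pass of standard backpropagation with logistic activations for the two-layer perceptron of Figure 11.6.5. It is ordinary C++ rather than the image algebra formulation, and the weight layout, the activation function, and the name backprop_step are assumptions made for the sketch.
#include <cmath>
#include <cstddef>
#include <vector>
// Logistic activation used at the hidden and output nodes.
static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }
// One training step for a two-layer perceptron (one hidden layer).
// w[j][i] is the weight from input i to hidden node j (i == 0 is a bias);
// v[k][j] is the weight from hidden node j to output node k (j == 0 is a bias).
// p is the training pattern, c the 0/1 target vector, eta the learning rate.
void backprop_step(std::vector<std::vector<double> > &w,
                   std::vector<std::vector<double> > &v,
                   const std::vector<double> &p,
                   const std::vector<double> &c, double eta)
{
    const std::size_t L = p.size(), M = w.size(), N = v.size();
    std::vector<double> h(M), o(N), delta_o(N), delta_h(M);
    // Forward pass: hidden activations, then output activations.
    for (std::size_t j = 0; j < M; ++j) {
        double s = w[j][0];
        for (std::size_t i = 0; i < L; ++i) s += w[j][i + 1] * p[i];
        h[j] = sigmoid(s);
    }
    for (std::size_t k = 0; k < N; ++k) {
        double s = v[k][0];
        for (std::size_t j = 0; j < M; ++j) s += v[k][j + 1] * h[j];
        o[k] = sigmoid(s);
    }
    // Backward pass: error terms at the output layer, then at the hidden layer.
    for (std::size_t k = 0; k < N; ++k)
        delta_o[k] = o[k] * (1.0 - o[k]) * (c[k] - o[k]);
    for (std::size_t j = 0; j < M; ++j) {
        double s = 0.0;
        for (std::size_t k = 0; k < N; ++k) s += delta_o[k] * v[k][j + 1];
        delta_h[j] = h[j] * (1.0 - h[j]) * s;
    }
    // Weight adjustment: each weight moves in proportion to the error term
    // of its level and the activation feeding it.
    for (std::size_t k = 0; k < N; ++k) {
        v[k][0] += eta * delta_o[k];
        for (std::size_t j = 0; j < M; ++j) v[k][j + 1] += eta * delta_o[k] * h[j];
    }
    for (std::size_t j = 0; j < M; ++j) {
        w[j][0] += eta * delta_h[j];
        for (std::size_t i = 0; i < L; ++i) w[j][i + 1] += eta * delta_h[j] * p[i];
    }
}
Calling backprop_step once for every training pair constitutes one epoch (Step 7).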
Let and o be the image variables used to store the values of the input layer and
the output layer, respectively. The template will be used to represent the weights
between the input layer and the hidden layer. Initialize w as follows:
The template represents the weights between the hidden layer and the output layer.
Initialize v by setting
and
The image algebra formulation of the backpropagation learning algorithm for one epoch is
To classify p = (p1, p2, …, pl), first augment it with a 1 to form the augmented pattern (1, p1, p2, …, pl). The pattern p is then
assigned to the class represented by the image c using the image algebra code:
Life evolves on a grid of points. A cell (point) is either alive or not alive. Life and its
absence are denoted by state values 1 and 0, respectively. The game begins with any configuration of living
and nonliving cells on the grid. Three rules govern the interactions among cells.
(a) Survivals — Every live cell with two or three live 8-neighbors survives for the next generation.
(b) Deaths — A live cell that has four or more live 8-neighbors dies due to over-population. A live cell
with one live 8-neighbor or none dies from loneliness.
(c) Birth — A cell is brought to life if exactly three of its 8-neighbors are alive.
Figure 11.7.1 shows the life cycle of an organism. Grid 0 is the initial state of the life-form. After
ten generations, the life-form cycles between the configurations in grids 11 and 12.
Figure 11.7.1 Life cycle of an artificial organism.
Depending on the initial configuration of live cells, the automaton will demonstrate one of four types of
behavior [14]. The pattern of live cells may:
(a) vanish in time
(b) evolve to a fixed finite size
(c) grow indefinitely at a fixed speed
(d) enlarge and contract irregularly
Let a0 be the initial configuration of cells. The next state of the automaton is given by the image algebra
formula
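For readers who prefer a conventional implementation, the following plain C++ sketch computes one generation of Life on a finite grid. The 0/1 encoding matches the state values above; treating cells outside the grid as dead is an assumption of the sketch, and it does not use the iac++ library.
#include <vector>
// One generation of Life; a[y][x] == 1 means the cell at (x, y) is alive.
std::vector<std::vector<int> > life_step(const std::vector<std::vector<int> > &a)
{
    const int rows = static_cast<int>(a.size());
    const int cols = rows > 0 ? static_cast<int>(a[0].size()) : 0;
    std::vector<std::vector<int> > next(rows, std::vector<int>(cols, 0));
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x) {
            int live = 0;                                   // live 8-neighbors
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dy == 0 && dx == 0) continue;
                    const int ny = y + dy, nx = x + dx;
                    if (ny >= 0 && ny < rows && nx >= 0 && nx < cols)
                        live += a[ny][nx];
                }
            if (a[y][x] == 1)
                next[y][x] = (live == 2 || live == 3) ? 1 : 0;  // survival / death
            else
                next[y][x] = (live == 3) ? 1 : 0;               // birth
        }
    return next;
}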
For this example, the next state of a point is a function of the current states of the points in a von Neumann
neighborhood of the target point. The next state rules that drive the maze solving CA are listed below (a plain
C++ sketch follows the list).
(a) A corridor point that is surrounded by three or four wall points becomes a wall point.
(b) Wall points always remain wall points.
(c) A corridor point surrounded by two or fewer wall points remains a corridor point.
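The sketch referred to above follows. The encoding wall = 1, corridor = 0 and the treatment of points outside the grid as walls are assumptions of the sketch, not part of the image algebra formulation. Iterating until no cell changes fills in every dead-end corridor, leaving only corridor points that lie on a through path.
#include <vector>
// Maze-solving cellular automaton over von Neumann neighborhoods.
// maze[y][x] == 1 denotes a wall point and 0 a corridor point.
// Repeats the next-state rules until no corridor point changes.
void solve_maze(std::vector<std::vector<int> > &maze)
{
    const int rows = static_cast<int>(maze.size());
    const int cols = rows > 0 ? static_cast<int>(maze[0].size()) : 0;
    const int dy[4] = { -1, 1, 0, 0 };
    const int dx[4] = { 0, 0, -1, 1 };
    bool changed = true;
    while (changed) {
        changed = false;
        std::vector<std::vector<int> > next = maze;
        for (int y = 0; y < rows; ++y)
            for (int x = 0; x < cols; ++x) {
                if (maze[y][x] == 1) continue;       // wall points remain walls
                int walls = 0;
                for (int k = 0; k < 4; ++k) {
                    const int ny = y + dy[k], nx = x + dx[k];
                    if (ny < 0 || ny >= rows || nx < 0 || nx >= cols || maze[ny][nx] == 1)
                        ++walls;                     // outside the grid counts as wall
                }
                if (walls >= 3) {                    // three or four wall neighbors
                    next[y][x] = 1;                  // dead-end corridor becomes wall
                    changed = true;
                }
            }
        maze = next;
    }
}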
The image algebra code for the maze solving cellular automaton is given by
Appendix
The Image Algebra C++ Library
A.1. Introduction
In 1992, the U.S. Air Force sponsored work at the University of Florida to implement a C++ class library to
support image algebra, iac++. This appendix gives a brief tour of the library and then provides examples of
programs implementing some of the algorithms provided in this text.
Current information on the most recent version of the iac++ class library is available via the Worldwide
Web from URL
http://www.cise.ufl.edu/projects/IACC/.
The distribution and some documentation are available for anonymous ftp from
ftp://ftp.cise.ufl.edu/pub/ia/iac++.
The class library has been developed in the C++ language [1, 2]. Executable versions of the library have
been developed and tested for the following compiler/system combinations:
Table A.1.1 Tested iac++ platforms.
Compiler System
g++ v 2.7.0 SunOS 4.1.3
SunOS 5.2.5
Silicon Graphics IRIX 5.3
Metrowerks v6 MacOS 7.5 (PowerPC)
The library is likely to be portable to other systems if one of the supported compilers is employed. Porting to
other compilers might involve significant effort.
A.2. Classes in the iac++ Class Library
The iac++ class library provides a number of classes and related functions to implement a subset of the
image algebra. At the time of this writing the library contains the following major classes to support image
algebra concepts:
(a) IA_Point<T>
(b) IA_Set<T>
(c) IA_Pixel<P,T>
(d) IA_Image<P,T>
(e) IA_Neighborhood<P,Q>
(f) IA_DDTemplate<I>
All names introduced into the global name space by iac++ are prefixed by the string IA_ to lessen the
likelihood of conflict with programmer-chosen names.
Points
The IA_Point<T> class describes objects that are homogeneous points with coordinates of type T. The
library provides instances of the following point classes:
(a) IA_Point<int>
(b) IA_Point<double>
The dimensionality of a point variable is determined at the time of creation or assignment, and may be changed
by assigning a point value having a different dimension to the variable. Users may create points directly with
constructors. Functions and operations may also result in point values.
One must #include "ia/IntPoint.h" to have access to the class of points with int coordinates and
its associated operations. One must #include "ia/DblPoint.h" to have access to the class of points
with double coordinates and its associated operations.
Point operations described in the sections that follow correspond to those operations presented in Section 1.2
of this book.
Relations on Points
Let p1 and p2 represent two expressions both of type IA_Point<int> or IA_Point<double>. Note
that a relation on points is said to be strict if it must be satisfied on all of the corresponding coordinates of
two points to be satisfied on the points themselves.
Table A.2.4 Relations on Points
less than (strict) p1 < p2
less than or equal to (strict) p1 <= p2
greater than (strict) p1 > p2
greater than or equal to (strict) p1 >= p2
equal to (strict) p1 == p2
not equal to (complement of ==) p1 != p2
lexicographic comparison pointcmp (p1, p2)
Examples of Point Code Fragments
One can declare points using constructors.
IA_Point<int> point1 (0, 0, 0);   // declaration assumed; a 3-dimensional integer point
point1[0] = point1[1] = 3;
// point1 == IA_Point<int> (3,3,0)
One can manipulate such points using arithmetic operations.
Sets
The C++ template class IA_Set<T> provides an implementation of sets of elements of type T. The
following instances of IA_Set are provided:
(a) IA_Set<IA_Point<int> >
(b) IA_Set<IA_Point<double> >
(c) IA_Set<bool>
(d) IA_Set<unsigned char>
(e) IA_Set<int>
(f) IA_Set<float>
(g) IA_Set<IA_complex>
(h) IA_Set<IA_RGB>
IA_Set<IA_Point<int> > and IA_Set<IA_Point<double> > provide some capabilities
beyond those provided for other sets.
// the while loop below iterates over all the points in the
// set ps3 and writes them to the standard output device
while(iter(p)) {cout << p << "\n"; }
IA_Set<int> v1(1,2,3);
// constructs a set containing three integers.
// sets of up to 5 elements can be
// constructed in this way
IA_SetIter<int> iter(v1);
int i;
Available only for images for which T is an integer type (bool, unsigned char, or int).
Subscripting with () yields the value associated with the point p, but subscripting with [] yields a reference
to the value associated with p. One may assign such a reference a new value, thus changing the image. This
operation is potentially expensive. This is discussed at greater length in the section providing example code
fragments below.
Relations on Images
Let img1 and img2 represent two expressions both of type IA_Image<P, T>.
Table A.2.17 Relations on Images
less than img1 < img2
less than or equal to img1 <= img2
equal to img1 == img2
greater than img1 > img2
greater than or equal to img1 >= img2
not equal to (complement of equal to) img1 != img2
strictly not equal to strict_ne (img1, img2)
Image Iterators
The class IA_Pixel<P, T> supports storing of a point and value as a unit and provides the two field
selectors point and value. Instances of the IA_Pixel class are provided for the same combinations of
P and T on which IA_Image instances are provided.
Iterators over either the values or the pixels of an image can be constructed and used. The class
IA_IVIter<P, T> supports image value iteration and class IA_IPIter<P, T> supports pixel
iteration. The value iterator provides an overloading of function call with a reference argument of the
image’s range type. The pixel iterator function call overloading takes a reference argument of type
IA_Pixel<P, T>.
The first time an image value (pixel) iterator object is applied as a function, it attempts to assign its
argument the value (pixel) that is associated with the first point in the image’s domain point set. If the image
is empty, a false value is returned to indicate that it failed, otherwise a true value is returned. Each
subsequent call assigns to the argument the value (pixel) associated with the next point in the image’s
domain. If the iterator fails to find such a value (pixel) because it has exhausted the image’s domain point
set, the iterator returns a false value, otherwise it returns a true value.
Note that the composition of an image with a point-to-point mapping function takes a third argument,
namely the point set over which the result is to be defined. This is necessary because determining the set of
points over which the result image defines values would require us to solve the generally unsolvable
problem of computing the inverse of the point-map function. Thus, we require the user to specify the
domain of the result.
i1[IA_IntPoint(2,4)] = 10;
// Subscripting a pixel of i1 causes new
// storage for its value mapping to be allocated
// whether or not an assignment is actually made
The iac++ library also supports assignment to a subregion of an image with the restrict_assign
member function.
i2.restrict_assign (i1);
// For each point p in both i1 and i2's domains
// This performs assignment
// i2[p] = i1(p);
One may restrict the value of an image to a specified set of points or to the set of points containing values in
a specified set, and one may extend an image to the domain of another image.
i1 = restrict(i2, ps4);
// i1's domain will be the intersection of
// ps4 and i2.domain()
i1 = restrict(i2, IA_Set<int>(1, 2, 3));
// i1's domain will be all those
// points in i2's domain associated with value
// 1, 2, or 3 by i2.
One may reduce an image to a single value with a binary operation on the range type if the image has an
extensive domain.
int my_function(int i) { . . . }
i1 = compose (my_function, i2);
// This is equivalent to assigning i1 = i2
// and for each point p in i1.domain() executing
// i1[p] = my_function(i1(p))
Likewise, since an image is conceptually a function, composition of an image with a function mapping points
to points will yield a new image. The point set of such an image cannot be effectively computed, thus one
must specify the set of points in the resulting image.
IA_Point<int>
reflect_through_origin(const IA_Point<int> &p)
{
return -p;
}
i1 = compose (i2,
reflect_through_origin,
IA_boxy_pset (-max(i2.domain()),
-min(i2.domain())));
// After executing this,
// i1(p) == i2(reflect_through_origin(p)) = i2(-p).
Input/output, arithmetic, bitwise, relational, and type conversion operations are provided on the image
classes.
// image I/O is supported for pbm, pgm, and ppm image formats
i3 = read_int_PGM ("image1.pgm");
i4 = read_int_PGM ("image3.pgm");
IA_IVIter<int> v_iter(i1);
// declares v_iter to be an iterator over the
// values of image i1
IA_IPIter<int> p_iter(i1);
// declares p_iter to be an iterator over the
// pixels of image i1
int i, sum = 0;
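The fragment stops after the declarations; a loop along the following lines (an assumed completion that follows the iterator protocol described earlier) would sum the values of i1.
while (v_iter(i)) {
    sum += i;
}
// When the loop exits, sum holds the total of all values in i1.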
Neighborhood operations presented in the sections that follow are introduced in Section 1.7.
Template Constructors
Let the argument image type I of an IA_DDTemplate class be IA_Image<P, T>. Let templ,
templ1, and templ2 denote objects of type IA_DDTemplate<I>. Let img denote the image to be
associated with the origin by an image algebra template. Let dim denote the dimensionality of the domain
of img. Let domain_pset denote the point set over which an image algebra template is to be defined. Let
the function I templ_func (IA_Point<int>) be the point-to-image mapping function of a
translation variant template.
Table A.2.23 Template Constructors and Assignment
construct an uninitialized template IA_DDTemplate<I> ()
construct a copy of a template IA_DDTemplate<I> (templ)
construct a translation invariant template IA_DDTemplate<I> (dim, img) or IA_DDTemplate<I> (domain_pset, img)
construct a translation variant template from a function IA_DDTemplate<I> (domain_pset, templ_func)
assign a template value templ1 = templ2
Template operations presented in the sections that follow are introduced in Section 1.5.
i1 = read_int_PGM ("temp-file.pgm");
// read the template image
i2 = read_int_PGM ("image.pgm");
// read an image to process
//
// example1.c -- Averaging Multiple Images
//
// usage: example1 file-1.pgm [file-2.pgm ... file-n.pgm]
//
// Read a sequence of images from files and average them.
// Display the result.
// Assumes input files are pgm images all having the same pointset
// and containing unsigned char values.
//
#include "ia/UcharDI.h"
#include "ia/FloatDI.h"
if (argc < 2) {
cerr << "usage: "
<< argv[0]
<< " file-1.pgm [file-2.pgm ... file-n.pgm]"
<< endl;
abort();
}
//
// example2.c -- Composition of an Image with a Function
//
#include <iostream.h>
#include "ia/UcharDI.h"
//
// point-to-point mapping function
//
IA_Point<int>
reflect_through_origin (const IA_Point<int> &p)
{
return -p;
}
int
main ()
{
// read in image from cin
IA_Image<IA_Point<int>, u_char> img = read_uchar_PGM (cin);
return 0;
}
//
// example1.c -- Local Averaging
//
// usage: example1 < file.pgm
//
// Read an image from cin and average it locally
// with a 3x3 neighborhood.
//
#include "math.h"
#include "ia/UcharDI.h"
#include "ia/IntDI.h"
#include "ia/Nbh.h"
#include "ia/UcharNOps.h"
u_char
average (u_char *uchar_vector, unsigned num)
{
if (0 == num) {
return 0;
} else {
int sum = 0;
for (int i = 0; i < num; i++) {
sum += uchar_vector[i];
}
return u_char (irint (float (sum) / num));
}
}
//
// Reduction with sum and division by nine yields a boundary
// effect at the limits of the image point set due to the
// lack of nine neighbors.
//
display (to_uchar (sum (source_image, box) / 9));
//
// Reduction with the n-ary average function correctly
// generates average values even at the boundary.
//
display (neighborhood_reduction (source_image,
box,
source_image.domain(),
average));
}
//
// Hough transform line detection (see Figures A.3.1 and A.3.2)
//
// Map the feature points of a source image into an accumulator array,
// threshold the accumulator, and map the identified cells back to
// display the corresponding lines in the source domain.
//
IA_Image<IA_Point<int>, int>
result;
IA_Set<IA_Point<int> >
accumulator_domain = source.domain();
//
// Initialize parameters for the Hough Neighborhood
//
hough_initialize (source.domain(), accumulator_domain);
//
// Map feature points to corresponding locations
// in the accumulator array.
//
accumulator = sum (source, hough_nbh, accumulator_domain);
//
// Threshold the accumulator
//
display (to_uchar(accumulator*255));
//
// Map back to see the corresponding lines in the
// source domain.
//
result = sum (hough_nbh, accumulator, source.domain());
// hough.h
//
// Copyright 1995, Center for Computer Vision and Visualization,
// University of Florida. All rights reserved.
#ifndef _hough_h_
#define _hough_h_
#include "ia/Nbh.h"
#include "ia/IntPS.h"
void
hough_initialize (IA_Set<IA_Point<int> > image_domain,
IA_Set<IA_Point<int> > accumulator_domain);
IA_Set<IA_Point<int> >
hough_function (const IA_Point<int> &r_t);
#endif
hough.c
// hough.c
//
// Copyright 1995, Center for Computer Vision and Visualization,
// University of Florida. All rights reserved.
#include "hough.h"
#include "math.h"
#include <iostream.h>
//
// hough_initialize:
//
// Initializes parameters for HoughFunction to allow a neighborhood
// to be created for given image and accumulator image point sets.
//
void
hough_initialize (IA_Set<IA_Point<int> > image_domain,
IA_Set<IA_Point<int> > accumulator_domain)
{
//
// Check to make sure the image domain and accumulator domain
// are both 2 dimensional rectangular point sets.
//
if (!image_domain.boxy() ||
!accumulator_domain.boxy() ||
image_domain.dim != 2 ||
accumulator_domain.dim != 2) {
//
// Record data necessary to carry out rho,theta to
// source pointset transformation.
//
RhoCells = accumulator_domain.sup()(0) -
accumulator_domain.inf()(0) + 1;
ThetaCells = accumulator_domain.sup()(1) -
accumulator_domain.inf()(1) + 1;
//
// Create sine and cosine lookup tables for specified
// accumulator image domain.
//
//
// hough_function
//
// This function is used to construct a Hough transform
// neighborhood. It maps a single accumulator cell location
// into a corresponding set of (x,y) coordinates in the
// source image.
//
IA_Set<IA_Point<int> >
hough_function (const IA_Point<int> &r_t)
{
double theta, rho;
//
// Convert accumulator image pixel location to
// correct (rho, theta) location.
//
//
// Construct vector of (x,y) points associated with (rho,theta)
// We check theta to determine whether we should make
// x a function of y or vice versa.
//
*pp = Delta +
IA_Point<int> (coord,
nint (rho -
(coord * HoughCos [r_t(1)]
/ HoughSin [r_t(1)])));
}
} else {
//
// Scan across all 1st coordinate indices
// assigning corresponding 0th coordinate indices.
//
num_points = ImageSize(1);
p_ptr = new IA_Point<int> [num_points];
*pp = Delta +
IA_Point<int> (nint (rho -
(coord * HoughSin [r_t(1)] /
HoughCos [r_t(1)])),
coord);
}
}
//
// Turn vector of points into a point set
//
Figure A.3.1 Source image (left) and associated accumulator array image (right).
Figure A.3.2 Binarized accumulator image (left) and the line associated with the identified cell (right).
A.4. References
1 B. Stroustrup, The C++ Programming Language Second Edition. Reading, MA: Addison-Wesley,
1991.
2 M. Ellis and B. Stroustrup, The Annotated C++ Reference Manual. Reading, MA:
Addison-Wesley, 1990.
3 J. Poskanzer, “Pbmplus: Extended portable bitmap toolkit.” Internet distribution, Dec. 1991.
INDEX
A
Additive maximum 27
Additive minimum 27
Averaging of multiple images 51, 347
Averaging, local 348
B
Bidirectional associative memory (BAM) 296
Binary image boundary finding 79
Binary image operations 16
Blob coloring 161
Bounded lattice ordered group 11
Butterworth highpass filter 76
C
Cellular array computer 1
Cellular automata
life 315
maze solving 316
D
Daubechies wavelet transform 217
Diamond distance 6
Dilation 173, 183
Directional edge detection 95
Discrete differencing 81
Distance 6
Distance transforms 147
Divide-and-conquer boundary detection 116
Domain, of an image 19
Dot product 5
Dynamic programming, edge following as 119
E
Edge detection and boundary finding
binary image boundaries 79
F
Fast Fourier transform 195
Floor 7
Fourier transform 189, 195
Fourier transform, centered 192
Frei-Chen edge and line detection 90
Frequency domain pattern matching 229
G
Generalized matrix product (p-product) 41
Global thresholding 125
H
Haar wavelet transform 209
Hadamard product 5
Hamming net 301
Heterogeneous algebra 4, 10
Hierarchical edge detection 104
Highpass filtering 76
Histogram 278
cumulative 280
equalization 66
modification 67
I
iac++ 321
Image operations 22
additive maximum 27
additive minimum 27
iac++ 343
left 26
linear 25
multiplicative maximum 28
multiplicative minimum 28
recursive 34
right 25
K
K-forms edge detector 106
Kirsch edge detector 93
L
Left image-template product 26
Life, game of 315
Line detection 239
Linear image transforms
M
Matched filter 225
Mathematical morphology 1
Max-min sharpening 55
Maze solving 316
Medial axis transform 145
Median filter 60
Mellin transform 237
Moment invariants 276
Moore neighborhood 7
Morphological transforms
Morphology 1
Multilevel thresholding 128
Multiplicative maximum 28
Multiplicative minimum 28
N
Neighborhood 7, 37, 340
Neural networks
Norm 6
O
Opening 178, 183
P
p-Product 41
Parameterized template 25
Partially ordered set (poset) 33
Pattern matching and shape detection
correlation 225
frequency domain matching 229
Hough transform ellipse detection 246
Hough transform for generalized shape detection 251
Hough transform line detection 239
matched filter 225
phase-only matched filter 229
rotation and scale invariant pattern matching 237
rotation invariant pattern matching 234
symmetric phase-only matched filter 229
template matching 225
Perceptron
multilayer 308
single-layer 305
Q
Quadtree 271
R
Recursive template 33
Region
adjacency 265
area and perimeter 257
moment invariants 276
orientation 274
position 274
symmetry 274
T
Template 23
iac++ 343
parameterized 25
recursive 33
recursive 36
Thresholding techniques
V
Value set 10, 329
Variable local averaging 53
Variable thresholding 129
von Neumann neighborhood 7
W
Wallis edge detector 89
Walsh transform 205
Wavelet transform 209, 217
Z
Zhang-Suen skeletonizing 151, 154