
University of Wollongong

Research Online

Faculty of Informatics - Papers (Archive)

Faculty of Engineering and Information Sciences

1-1-2008

Feature based stereo correspondence using moment invariant


Prashan Premaratne
University of Wollongong, [email protected]

Farzad Safaei
University of Wollongong, [email protected]

Follow this and additional works at: https://ro.uow.edu.au/infopapers

Part of the Physical Sciences and Mathematics Commons

Recommended Citation
Premaratne, Prashan and Safaei, Farzad: Feature based stereo correspondence using moment invariant
2008, 104-108.
https://ro.uow.edu.au/infopapers/1667

Research Online is the open access institutional repository for the University of Wollongong. For further information
contact the UOW Library: [email protected]
Feature based stereo correspondence using moment invariant

Abstract
Autonomous navigation is seen as a vital tool in harnessing the enormous potential of Unmanned Aerial
Vehicles (UAVs) and small robotic vehicles for both military and civilian use. Even though laser-based
scanning solutions for Simultaneous Localization and Mapping (SLAM) are considered the most reliable for
depth estimation, they are not feasible for use in UAVs and small land-based vehicles due to their
physical size and weight. Stereovision is considered the best approach for any autonomous navigation
solution, as stereo rigs are lightweight and inexpensive. However, stereoscopy, which estimates depth
information from pairs of stereo images, can still be computationally expensive and unreliable. This is
mainly because some of the algorithms used in successful stereovision solutions have high computational
requirements that cannot be met by small robotic vehicles. In our research, we implement a feature-based
stereovision solution that uses moment invariants as a metric to find corresponding regions in image
pairs. This reduces the computational complexity and improves the accuracy of the disparity measures,
which is significant for use in UAVs and small robotic vehicles.

Keywords
Feature, based, stereo, correspondence, using, moment, invariant

Disciplines
Physical Sciences and Mathematics

Publication Details
Premaratne, P. & Safaei, F. 2008, "Feature based stereo correspondence using moment invariant", 4th
International Conference on Information and Automation for Sustainability. Sustainable Development
Through Effective Man-Machine Co-Existence, IEEE Region 10 and ICIAFS, Colombo, Sri Lanka, pp.
104-108.

This conference paper is available at Research Online: https://ro.uow.edu.au/infopapers/1667


Feature based Stereo Correspondence using Moment
Invariant
Prashan Premaratne, Farzad Safaei
School of Electrical, Computer & Telecommunications Engineering,
The University of Wollongong
North Wollongong, Australia
[email protected]

Abstract—Autonomous navigation is seen as a vital tool in harnessing the enormous potential of
Unmanned Aerial Vehicles (UAVs) and small robotic vehicles for both military and civilian use. Even
though laser-based scanning solutions for Simultaneous Localization and Mapping (SLAM) are considered
the most reliable for depth estimation, they are not feasible for use in UAVs and small land-based
vehicles due to their physical size and weight. Stereovision is considered the best approach for any
autonomous navigation solution, as stereo rigs are lightweight and inexpensive. However, stereoscopy,
which estimates depth information from pairs of stereo images, can still be computationally expensive
and unreliable. This is mainly because some of the algorithms used in successful stereovision
solutions have high computational requirements that cannot be met by small robotic vehicles. In our
research, we implement a feature-based stereovision solution that uses moment invariants as a metric
to find corresponding regions in image pairs. This reduces the computational complexity and improves
the accuracy of the disparity measures, which is significant for use in UAVs and small robotic
vehicles.

Keywords—moment invariants, feature-based stereo correspondence, sum of squared difference (SSD)

978-1-4244-2900-4/08/$25.00 ©2008 IEEE ICIAFS08

I. INTRODUCTION
Stereo vision is a mechanism for obtaining depth information from digital images. The challenge in
stereovision is how to find corresponding points in the left image and the right image, known as the
correspondence problem. Once a pair of corresponding points is found, the depth can be computed using
triangulation. There are two prominent approaches to finding such corresponding pairs, namely
area-based and feature-based techniques. In the area-based techniques, every pixel in a designated
area of one image is compared with the pixels in the same row of the other image. This is done with a
few constraints, such as maximum disparity, to avert false matches. Some of the well-known techniques
in this approach are the Hierarchical Block Matching [1], Census [2], Correlation Matching [3-4] and
Zitnick-Kanade (Cooperative Algorithm for Stereo Matching and Occlusion Detection) [5-6] algorithms.
The feature-based methods rely on finding special features in corresponding pairs and may result in
fewer depth values, lowering the computational complexity.

Our approach is aimed at controlling small robotic vehicles using stereovision for depth calculation.
This depth information will be used in control algorithms to detect and avoid obstacles. If this depth
information is to be useful, it needs to be estimated in real time, which requires the stereovision
algorithm to be computationally inexpensive.

Following our success in using moment invariants for recognizing hand gestures, moment invariants can
be of great use in finding corresponding matching regions in stereo pairs [7-8]. Moment invariants are
invariant to rotation, scale and shift. The rotation invariance is especially beneficial to the stereo
correspondence problem, as any misalignment or non-flat ground conditions can create slightly rotated
versions of a scene in either of the cameras. In our approach, we rely on edge-corner detection
algorithms such as Harris corner detection [8] to produce reliable feature points. This results in
fewer points of interest compared to area-based techniques [9-18]. An image can be separated into a
collection of blocks, and each block can be marked as a candidate or not depending on whether it
contains corners (or edges). These blocks can be matched with the help of moment invariants, and the
disparity of the identified features can then be simply calculated.

In this paper, Section II details the general stereo matching approaches and the moment invariant
based technique is presented in detail in Section III. This is followed by our experimental results
and the conclusion.

II. STEREO MATCHING APPROACHES
In area-based stereo matching, for a given pair of stereo images, the corresponding points are
supposed to lie on the epipolar lines [19]. Area-based techniques rely on the assumption of surface
continuity, and often involve some correlation measure to construct a disparity map with an estimate
of disparity for each point visible in the stereo pair. Area-based techniques produce much denser
disparity maps, which is critical in obstacle detection and avoidance.

Since corresponding points are the same real point in the captured scene projected into the left and
right images, we can assume that their surroundings in both pictures would be quite similar.
Area-based methods use this similarity for corresponding point detection [9-13]. It is computed from
the difference in local neighborhoods (usually a constant size
square) of the points. Computing the similarity of two points is the elementary step in the method and
cannot be accelerated. The main problem is the search for the corresponding point in the other
picture. The naive area-based algorithm chooses a point from the first image and runs through all the
points in the second image to find its corresponding point. This inefficient process can be
accelerated by restricting the search area to a specific region around the corresponding pixel by
specifying a maximum disparity. The most efficient method is adapted from epipolar geometry and is
known as the epipolar constraint.

Area-based methods are considered to be computationally expensive due to the exhaustive nature of the
metrics being used. Sum of Squared Difference (SSD), Sum of Absolute Difference (SAD), Sum of Sum of
Squared Difference (SSSD) and cross-correlation based metrics use every pixel in their calculation,
which makes them exhaustive.

Figure 1. Stereopsis Approaches
[Diagram. Area based techniques: Block Matching Algorithm, Correlation Algorithm, Census Algorithm,
Zitnick-Kanade Stereo Algorithm. Feature based techniques: Moment Invariant based technique,
Monogenic Phase Algorithm.]

Figure 2. Left and Right stereo images with ‘corners’ marked using Harris corner detection

Feature-based stereo matching techniques focus on local intensity variations and generate depth
information only at points where features are detected. In general, feature-based techniques provide
more accurate information in terms of locating depth discontinuities and thus achieve fast and robust
matching. However, they yield very sparse range maps and may require an expensive feature extraction
process. Edge elements, corners, line segments, and curve segments are features that are robust
against changes of perspective, and they have been widely used in stereovision work. Features such as
edge elements and corners are easy to detect; however, they may suffer from occlusion, whereas line
and curve segments require extra computation time but are more robust against occlusion. Higher level
image features such as circles, ellipses, and polygonal regions have also been used as features for
stereo matching. These features are, however, restricted to images of indoor scenes. Nevertheless,
feature based techniques are associated with computation of conjugate
pairs with subpixel accuracy. They also include object-dependent constraints in the solution of the
correspondence problem, such as ‘corners’ when using the Harris corner detection algorithm.

III. MOMENT INVARIANT BASED STEREO MATCHING
Using invariant moments to locate corresponding features in stereo pairs is less computationally
intensive, as the number of block comparisons depends on the disparity constraint as well as the
number of features. In our approach, ‘corners’ are used as features, as shown in Fig. 2. The left
image is then divided into 20x20 pixel blocks, and the blocks that contain features are marked
(Fig. 3). The blocks containing features are used to calculate the first 4 moment invariants using
Equations (3) to (6). Even though up to 7 such invariants can be calculated, 4 are adequate to
uniquely represent a square. Using epipolar and disparity constraints, we can now evaluate the
adjoining 20x20 pixel blocks in the right image for matches with the blocks containing the features.
The ‘closeness’ of these moments is decided using a threshold that depends on the image scenery. When
such blocks are identified, the simple depth calculation formula can be used to calculate the depth to
the identified feature as follows:

$$d = b\,\frac{f}{D}$$

where d is the depth of the object from the camera plane, f is the focal length of the camera, D is
the disparity and b is the baseline distance. Here the underlying assumption is that the epipolar
lines run parallel to the image lines, so that corresponding points lie on the same image lines.

Figure 3. Left image is divided into blocks of size 20x20 pixels

A. Moment Invariants
The moment invariants algorithm is known as one of the most effective methods for extracting
descriptive features for object recognition applications. The algorithm has been widely applied in
the classification of aircraft, ships, ground targets, etc. [20, 21]. Essentially, the algorithm
derives a number of self-characteristic properties from a binary image of an object. These properties
are invariant to rotation, scale and translation. Let f(i,j) be a point of a digital image of size
M×N (i = 1,2, …, M and j = 1,2, …, N). The two-dimensional moments and central moments of order
(p + q) of f(i,j) are defined as:

$$m_{pq} = \sum_{i=1}^{M} \sum_{j=1}^{N} i^{p} j^{q} f(i,j) \tag{1}$$

$$U_{pq} = \sum_{i=1}^{M} \sum_{j=1}^{N} (i - \bar{i})^{p} (j - \bar{j})^{q} f(i,j) \tag{2}$$

where $\bar{i} = m_{10}/m_{00}$ and $\bar{j} = m_{01}/m_{00}$. The first four moment invariants are:

$$\varphi_1 = \eta_{20} + \eta_{02} \tag{3}$$

$$\varphi_2 = (\eta_{20} - \eta_{02})^{2} + 4\eta_{11}^{2} \tag{4}$$

$$\varphi_3 = (\eta_{30} - 3\eta_{12})^{2} + (3\eta_{21} - \eta_{03})^{2} \tag{5}$$

$$\varphi_4 = (\eta_{30} + \eta_{12})^{2} + (\eta_{21} + \eta_{03})^{2} \tag{6}$$

where $\eta_{pq}$ is the normalized central moment defined by:

$$\eta_{pq} = \frac{U_{pq}}{U_{00}^{\,r}}$$

B. Example of Invariant Properties
Fig. 4 shows an image containing the letter ‘A’, together with rotated and scaled, translated and
noisy versions of it. Their respective moment invariants, calculated using Equations (1) to (6) and 3
other equations not defined in this paper, are shown in Table 1. It is obvious from Table 1 that the
algorithm produces the same result for the first three orientations of the letter ‘A’ despite the
different transformations applied to them. Only one value, Φ1, displays a small discrepancy of 5.7%
due to the difference in scale. The other values of the three figures are effectively the same for
Φ2, Φ3, Φ4, Φ5, Φ6 and Φ7. The last letter, however, reveals the drawback of the algorithm: it is
susceptible to noise. Specifically, the added noisy spot in the letter has changed the entire set of
moment invariants. This drawback suggests that moment invariants should be applied to noise-free
images in order to achieve the best results. Since the algorithm is robust against these
transformations, a simple classifier can exploit the moment invariant values to differentiate as well
as recognize the letter ‘A’ from other letters. In this paper, we have used the first 4 moment
invariants to compare similar regions rather than calculating similarity between every pixel using
correlation.

Figure 4. Letter ‘A’ in different orientations

TABLE I. MOMENT INVARIANT VALUES FOR ‘A’

In the algorithm, each block containing at least one ‘feature’ is compared with the other blocks in
the same horizontal region of the other image. Since only 4 values are compared per block, against
400 such comparisons in the case of correlation (one per pixel of a 20x20 block), the computational
saving is immense and leads to faster depth computations. The algorithm is moderately easy to
implement and requires only a moderate computational effort from the CPU. Feature extraction, as a
result, can proceed rapidly and efficiently. A summary of the proposed technique is presented in
Figure 5.

If certain blocks do not contain any features, the search area in the other (right) image can be
restricted using the maximum disparity constraint. This further cuts down the computational
requirements compared with area-based correlation techniques.

IV. EXPERIMENTAL RESULTS
We ran the proposed algorithm on 100 stereo pairs generated using a BumbleBee™ stereo rig mounted on a
mobile robot producing 320x240 pixel frames. The path the robot covered had special markers to provide
feature points, as most of the cubicles of the indoor setup had monotonous, flat, featureless walls
and partitions. This was followed by running a correlation-based stereo matching algorithm using the
SSD and SAD metrics. The results are shown in Fig. 6. The processing system comprised a Pentium 4
system running at 2GHz with 2GB of RAM. The system was capable of processing 10 frames per second for
both the moment invariant based approach and the correlation-based technique using the SAD metric;
however, it only managed 5 frames per second using SSD. This was expected, as SSD involves more
computational complexity than SAD. However, it should be pointed out that the proposed approach
computed only a handful of disparity values. We also used the Zitnick-Kanade algorithm to estimate the
disparity, even though the results are not presented due to the inability of the algorithm to run
anywhere near real time with our modest processing power. Since this algorithm expects multiple
iterations to refine its estimates, it is simply not useful for the realtime applications that we are
interested in. Fig. 5 summarizes the major steps in the proposed algorithm.

The major advantage of the local approaches presented here is speed and suitability for hardware
implementation. Global optimization algorithms commonly require 2 to 3 orders of magnitude more time
than even the software implementations of local methods.

V. SUMMARY
In many respects, feature based algorithms are established as the most robust way to implement stereo
vision algorithms for industrial-type stereo problems. The advantages offered by using features are
that feature-based representations contain desirable statistical properties and provide algorithmic
flexibility to the programmer, in that algorithmic constraints can be applied explicitly to the data
structures rather than implicitly, as with area based correlation techniques. In particular, the use
of ‘corners’ leads to algorithms which are as locally accurate as the precision to which the edges can
be extracted.

Even though feature-based techniques do not produce dense disparity maps, their values are more
accurate than those of the area-based techniques. Some of the reasons for this are that the presence
of shadows produces erroneous results; some surfaces reflected light non-uniformly; backgrounds are
usually flat single-colored surfaces; and some parts of the scene were occluded in one of the images.

Figure 5. Summary of the proposed technique
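
The steps described in Section III (mark the 20x20 blocks that contain corners, compare their moment invariants along the same block row, then convert disparity to depth) can be sketched in Python as follows. This is a hedged illustration rather than the authors' code, and several details are assumptions the paper does not specify: corners are taken as precomputed (row, column) pairs for both images, 'closeness' is the largest absolute difference between the four invariants, disparity is the difference between the mean corner columns of the two matched blocks, and all function names, the default threshold and the default maximum disparity are hypothetical.

```python
import numpy as np

BLOCK = 20  # block size in pixels, as in the text


def invariants(block):
    """First four moment invariants (Eqs. 3-6) of a block, or None if it is empty."""
    f = np.asarray(block, dtype=float)
    m00 = f.sum()
    if m00 == 0:
        return None  # moments are undefined for an all-zero block
    i, j = np.mgrid[1:f.shape[0] + 1, 1:f.shape[1] + 1]
    di = i - (i * f).sum() / m00
    dj = j - (j * f).sum() / m00

    def eta(p, q):
        # Normalized central moment; the exponent (p + q) / 2 + 1 is ASSUMED.
        return (di ** p * dj ** q * f).sum() / m00 ** ((p + q) / 2 + 1)

    return np.array([
        eta(2, 0) + eta(0, 2),
        (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2,
        (eta(3, 0) - 3 * eta(1, 2)) ** 2 + (3 * eta(2, 1) - eta(0, 3)) ** 2,
        (eta(3, 0) + eta(1, 2)) ** 2 + (eta(2, 1) + eta(0, 3)) ** 2,
    ])


def group_by_block(corners):
    """Map (block_row, block_col) to the columns of the corners inside that block."""
    blocks = {}
    for r, c in corners:
        blocks.setdefault((r // BLOCK, c // BLOCK), []).append(c)
    return blocks


def cut(image, br, bc):
    """Extract block (br, bc) from an image."""
    return image[br * BLOCK:(br + 1) * BLOCK, bc * BLOCK:(bc + 1) * BLOCK]


def match_blocks(left, right, corners_left, corners_right,
                 max_disparity=40, threshold=1e-3):
    """Return {(block_row, block_col): disparity in pixels} for matched left blocks."""
    left_blocks = group_by_block(corners_left)
    right_blocks = group_by_block(corners_right)
    reach = -(-max_disparity // BLOCK)  # ceiling: block columns to search
    disparities = {}
    for (br, bc), left_cols in left_blocks.items():
        ref = invariants(cut(left, br, bc))
        if ref is None:
            continue
        best = None
        # Epipolar constraint: same block row. Disparity constraint: nearby columns.
        for cand_bc in range(bc, max(bc - reach, 0) - 1, -1):
            if (br, cand_bc) not in right_blocks:
                continue  # only blocks that contain a feature are candidates
            cand = invariants(cut(right, br, cand_bc))
            if cand is None:
                continue
            dist = np.abs(cand - ref).max()  # ASSUMED closeness measure
            if dist <= threshold and (best is None or dist < best[0]):
                best = (dist, cand_bc)
        if best is not None:
            right_cols = right_blocks[(br, best[1])]
            disparities[(br, bc)] = float(np.mean(left_cols) - np.mean(right_cols))
    return disparities


def depth(disparity, baseline, focal_length):
    """Depth d = b * f / D, with f and D in pixels and d in the units of b."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return baseline * focal_length / disparity
```

Only four numbers are compared per candidate block, which is where the saving over a 400-pixel correlation comes from. As the text notes, the threshold depends on the scenery, so the default used here is a placeholder to be tuned.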

Figure 6. Comparison of Moment Invariant technique with correlation method using SSD and SAD metrics

We managed to demonstrate that feature-based techniques relying on moment invariants for matching can
process a frame in the order of tenths of a second in software implementations. This implies that the
algorithm can comfortably reach higher video rates using DSP and FPGA implementations [22]. At the
moment, there is no technique for simultaneously achieving the high quality range obtained from global
optimization and the fast run-times of local schemes.

REFERENCES
[1] Q. Koschan, V. Rodehorst and K. Spiller, “Color stereo vision using hierarchical block matching
and active color illumination”, Proc. 13th Int. Conf. Pattern Recog., Vol. 1, pp. 835-839, 1996.
[2] R. Zabih and J. Woodfill, “Non-parametric local transformers for computing visual
correspondence”, Third European Conf. Computer Vision, 1994.
[3] J. C. M. van Beek and J. J. Lukkien, “A parallel algorithm for stereo vision based on
correlation”, Proc. 3rd Int. Conf. High Performance Computing, 1996.
[4] H. Hirschmüller, P. R. Innocent, J. M. Garibaldi, “Real-Time Correlation-Based Stereo Vision with
Reduced Border Errors”, Int. Journal of Computer Vision, Vol. 47(1/2/3), pp. 229-246, 2002.
[5] C. Zitnick and T. Kanade, “A Cooperative Algorithm for Stereo Matching and Occlusion Detection,”
tech. report CMU-RI-TR-99-35, Robotics Institute, Carnegie Mellon University, October, 1999.
[6] C. Zitnick and T. Kanade, “A Cooperative Algorithm for Stereo Matching and Occlusion Detection,”
IEEE Trans. Pattern Analysis and Machine Intellig., Vol. 22-7, pp. 675-684, 2000.
[7] P. Premaratne and Q. Nguyen, “Consumer electronics control system based on hand gesture moment
invariants”, IET Computer Vision, vol. 1-1, pp. 35-41, 2007.
[8] P. Premaratne, F. Safaei and Q. Nguyen, “Moment Invariant Based Control System Using Hand
Gestures”, in Intelligent Computing in Signal Processing and Pattern Recognition, Lecture Notes in
Control and Information Sciences, Springer Berlin / Heidelberg, Vol. 345/2006, pp. 322-333, 2006.
[9] C. Harris and M. Stephens, “A Combined Corner and Edge Detector”, Proc. 4th Alvey Vision
Conference, pp. 147-151, 1988.
[10] L. Di Stefano, M. Marchionni, S. Mattoccia, G. Neri, “A Fast Area-Based Stereo Matching
Algorithm”, 15th IAPR/CIPRS International Conference on Vision Interface, Calgary, Canada,
May 27-29, 2002.
[11] A. Fusiello, V. Roberto, E. Trucco, “Experiments with a new Area-Based Stereo Algorithm”,
International Conference on Image Analysis and Processing, Florence, 1997.
[12] S. T. Barnard, W. B. Thompson, “Disparity Analysis of Images”, IEEE Trans. of PAMI, vol. PAMI-2,
pp. 4, 1980.
[13] M. J. Hannah, “Bootstrap Stereo”, Proc. Image Understanding Workshop, 1980.
[14] M. J. Hannah, “SRI’s Baseline Stereo System”, Proc. of DARPA Image Understanding Workshop,
pp. 149-155, 1985.
[15] M. J. Hannah, “A System for Digital Stereo Image Matching”, Photogrammetric Engineering and
Remote Sensing, pp. 1765-1770, 1989.
[16] P. Burt, B. Julesz, “Modifications of the Classical notion of Panum’s Fusional Area”,
Perception, vol. 9, pp. 671-682, 1980.
[17] R. A. Lane, N. A. Thacker, N. L. Seed, “Stretch-Correlation as a Real-Time Alternative to
Feature Based Stereo Matching Algorithms”, Image and Vision Comp. Journal, vol. 12, No. 4, May 1994.
[18] R. A. Lane, N. A. Thacker, N. L. Seed, P. A. Ivey, “A Stereo Vision Processor”, Proc. of IEEE
Custom Integrated Circuits Conference, 1995.
[19] J. Weng, “Camera calibration with distortion models and accuracy evaluation”, IEEE Trans. Patt.
Anal. Machine Intel., Vol. 14, pp. 965-980, 1992.
[20] P. Premaratne, “ISAR ship classification; An alternative approach”, CSSIP-DSTO Internal
Publication, Australia, March, 2003.
[21] Q. Zhongliang and W. Wenjun, “Automatic ship classification by superstructure moment invariants
and two-stage classifier”, ICCS/ISITA '92 Comm. on the Move, pp. 544-547, 1992.
[22] J. van der Horst, R. van Leeuwen, H. Broers, R. Kleihorst, P. Jonker, “A Real-Time Stereo
SmartCam, using FPGA, SIMD and VLIW”, Proc. of the 2nd Workshop on Applications of Computer Vision,
May 12, 2006.
