UWIT: Underwater Image Toolbox For Optical Image Processing and Mosaicking in M
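$$\frac{F_2(\omega_1, \omega_2)}{F_1(\omega_1, \omega_2)} = e^{-j(\omega_1 x_0 + \omega_2 y_0)} \tag{3}$$

$$\mathcal{F}^{-1}\left\{ e^{-j(\omega_1 x_0 + \omega_2 y_0)} \right\} = \delta(x - x_0, y - y_0) \tag{4}$$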
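The translation recovery expressed by (3) and (4) can be illustrated numerically. The following is a minimal Python/NumPy sketch, not the toolbox's MATLAB code. The function name `phase_correlation`, the use of the normalized cross-power spectrum in place of the direct ratio in (3), and the small stabilizing constant are assumptions made for illustration. It handles circular, integer shifts only.

```python
import numpy as np

def phase_correlation(f1, f2):
    """Estimate the circular shift (x0, y0) with f2[y, x] = f1[y - y0, x - x0]."""
    F1 = np.fft.fft2(f1)
    F2 = np.fft.fft2(f2)
    # Cross-power spectrum; after normalization only the phase term of (3) remains.
    cross = F2 * np.conj(F1)
    cross /= np.abs(cross) + 1e-12  # assumed guard against division by zero
    # Inverse transform of the pure phase term is an impulse at the shift, as in (4).
    surface = np.real(np.fft.ifft2(cross))
    y0, x0 = np.unravel_index(np.argmax(surface), surface.shape)
    return int(x0), int(y0), surface
```

Shifts are returned modulo the image size, so a negative offset appears as a large positive index and would need to be unwrapped by the caller.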
Rotation Only
This same property can be exploited for images which are
rotated and scaled by representing them in a coordinate system
where scale and rotations appear as shifts. For example, when
$f_2$ is a rotated version of $f_1$,
$$f_2(x, y) = f_1(x\cos\theta_0 + y\sin\theta_0,\; -x\sin\theta_0 + y\cos\theta_0) \tag{5}$$
their Fourier transforms are related by
$$F_2(\omega_1, \omega_2) = F_1(\omega_1\cos\theta_0 + \omega_2\sin\theta_0,\; -\omega_1\sin\theta_0 + \omega_2\cos\theta_0) \tag{6}$$
Using only the magnitudes of the Fourier transforms and converting
to polar coordinates, we see that the rotation can be represented
as a shift
$$M_2(\rho, \theta) = M_1(\rho, \theta - \theta_0) \tag{7}$$
Scaling Only
Similarly, when two images are related by a scale factor, a,
then their Fourier transforms are related by
$$F_2(\omega_1, \omega_2) = \frac{1}{a^2} F_1(\omega_1 / a, \omega_2 / a) \tag{8}$$
Taking the logarithm of the frequency axes results in the scale
appearing as a shift (ignoring the $1/a^2$ scale factor)
$$F_2(\log\omega_1, \log\omega_2) = F_1(\log\omega_1 - \log a, \log\omega_2 - \log a) \tag{9}$$
Translation, Rotation, and Scaling
When translation, rotation, and scaling are all present between
the two images, we see that representing the magnitudes
in a log-polar coordinate system results in
$$M_2(\rho, \theta) = M_1(\rho / a, \theta - \theta_0) \tag{10}$$
$$M_2(\log\rho, \theta) = M_1(\log\rho - \log a, \theta - \theta_0) \tag{11}$$
Rotation and scaling can now both be recovered using the
Fourier phase shifting property. After recovery of those parameters,
image $f_2$ can be warped to compensate for the rotation
and scaling. Finally, the standard phase correlation technique
can be applied to recover the remaining translational offset
between $f_1$ and $f_2$.
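To make (11) concrete, the sketch below converts a measured shift on a sampled log-polar grid into a rotation angle and a scale factor. It is an illustrative Python sketch under stated assumptions, not the toolbox's code: the grid is assumed to have `n_theta` angular samples spanning half a turn and `n_rho` radial samples spaced logarithmically between `rho_min` and `rho_max`. These names and the grid layout are hypothetical.

```python
import math

def shifts_to_rotation_scale(d_theta, d_rho, n_theta, n_rho, rho_min, rho_max):
    """Convert log-polar grid shifts (in samples) to (theta0 in radians, scale a)."""
    # One angular sample covers pi / n_theta radians (assumed half-plane grid).
    theta0 = d_theta * math.pi / n_theta
    # One radial sample covers a fixed step in log(rho) (assumed log spacing).
    log_step = math.log(rho_max / rho_min) / (n_rho - 1)
    # A shift of d_rho samples along log(rho) equals log(a), per (11).
    a = math.exp(d_rho * log_step)
    return theta0, a
```

Under these assumptions the angular resolution is fixed by `n_theta` and the scale resolution by the radial log step, so finer grids resolve smaller rotations and scale changes.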
Implementation
The raw images are all first preprocessed using CLAHS to
minimize lighting effects present in the imagery. The image is
cosine windowed and then edge enhanced using a Laplacian of
Gaussian filter. Mapping of the Fourier magnitude from Cartesian
to polar coordinates only requires the top half of the spectra
due to complex conjugate symmetry. Recovery of the shifts
$\theta_0$ and $\log a$ is achieved through normalized correlation of
the two log-polar magnitudes $M_1$ and $M_2$. The image $f_2$ is
then rotated and scaled according to the results. Finally, the
warped $f_2$ is correlated with $f_1$ using normalized correlation
to recover the translational offsets.
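Two of the steps above, cosine windowing and the use of only half the magnitude spectrum, can be illustrated as follows. This is a hedged Python/NumPy sketch rather than the toolbox's implementation: the separable Hann window is an assumed choice of cosine window, and the function names are hypothetical.

```python
import numpy as np

def cosine_window(image):
    """Taper a 2-D image toward zero at its borders with a separable Hann window."""
    h, w = image.shape
    window = np.outer(np.hanning(h), np.hanning(w))  # assumed window type
    return image * window

def magnitude_is_symmetric(image):
    """Check |F(w1, w2)| == |F(-w1, -w2)| for a real-valued image."""
    mag = np.abs(np.fft.fft2(image))
    # Flipping both axes and rolling by one sample maps index k to -k (mod N).
    mirrored = np.roll(mag[::-1, ::-1], 1, axis=(0, 1))
    return bool(np.allclose(mag, mirrored))
```

Because the magnitude spectrum of a real image is point symmetric, angles in one half-plane carry all of the information, which is why half of the spectrum suffices for the polar mapping.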
The parameters used allow us to, theoretically, resolve ro-
tations of 0.3
$$\frac{\sum_{i,j}\left(f[x+i, y+j] - \bar{f}_W\right)\left(g[x+u+i, y+v+j] - \bar{g}_W\right)}{\sqrt{\sum_{i,j}\left(f[x+i, y+j] - \bar{f}_W\right)^2 \sum_{i,j}\left(g[x+u+i, y+v+j] - \bar{g}_W\right)^2}} \tag{12}$$
The lack of rich features in underwater imagery poses difficult
challenges for indirect feature based methods, and experimental
evidence suggests that direct correlation based methods
yield good results. We use local normalized correlation surfaces
calculated for each pixel to determine a dense set of correspondences
between images. This set of dense correspondences is
then pruned by only considering pixels which have a concave
correlation surface as reliable matches (we fit a quadratic surface
near the correlation surface peak and analytically check for
concavity as a method of outlier rejection [8]).
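The local correlation surface and the concavity test can be sketched as follows. This is an illustrative Python/NumPy sketch under assumptions, not the authors' implementation: the window half-width `half`, the search radius `search`, the 3 x 3 least-squares fit, and the function names are all hypothetical choices.

```python
import numpy as np

def normalized_correlation(f_win, g_win):
    """Zero-mean normalized correlation of two equally sized windows."""
    f0 = f_win - f_win.mean()
    g0 = g_win - g_win.mean()
    denom = np.sqrt((f0 ** 2).sum() * (g0 ** 2).sum())
    return float((f0 * g0).sum() / denom) if denom > 0 else 0.0

def correlation_surface(f, g, x, y, half, search):
    """Correlate the window of f centered at (x, y) with windows of g
    displaced by (u, v), for |u|, |v| <= search."""
    f_win = f[y - half:y + half + 1, x - half:x + half + 1]
    surf = np.zeros((2 * search + 1, 2 * search + 1))
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            g_win = g[y + v - half:y + v + half + 1,
                      x + u - half:x + u + half + 1]
            surf[v + search, u + search] = normalized_correlation(f_win, g_win)
    return surf

def is_concave_peak(surf):
    """Fit z = a*u^2 + b*v^2 + c*u*v + d*u + e*v + k to the 3x3 patch around
    the peak and test whether the fitted Hessian is negative definite."""
    pv, pu = np.unravel_index(np.argmax(surf), surf.shape)
    if not (0 < pv < surf.shape[0] - 1 and 0 < pu < surf.shape[1] - 1):
        return False  # peak on the border: no full neighborhood to fit
    vs, us = np.mgrid[-1:2, -1:2]
    u, v = us.ravel(), vs.ravel()
    A = np.column_stack([u ** 2, v ** 2, u * v, u, v, np.ones(9)])
    z = surf[pv - 1:pv + 2, pu - 1:pu + 2].ravel()
    a, b, c, *_ = np.linalg.lstsq(A, z, rcond=None)[0]
    # Hessian [[2a, c], [c, 2b]] is negative definite iff a < 0 and 4ab - c^2 > 0.
    return bool(a < 0 and 4 * a * b - c * c > 0)
```

A correspondence would be kept only when `is_concave_peak` returns true for its surface; peaks on the border of the search region are rejected here as a conservative choice.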
Figure 3 displays mosaic results when images are mosaicked
together in a globally consistent manner utilizing all available
cross-linked image pair correspondences. Local normalized
correlation was used as the similarity measure for determining
point correspondences.
V. MULTIRESOLUTION PYRAMIDAL BASED BLENDING
Due to the rapid attenuation of light underwater, the only way
to get a large area view of the seafloor is to build up a
photomosaic from smaller local images as seen in Figure 3. The
mosaic technique is used to construct an image with a far larger
field of view and greater resolution than could be obtained with
a single photograph. However, once the mosaic is generated,
differences in image intensities due to image processing or
acquisition can lead to clearly visible borders between images in
the mosaic. A technical problem in image representation, then,
is how to join image borders so that the boundary between them
is not visible.

Fig. 2. Registration of underwater imagery from the Derbyshire data set using
Fourier based methods to recover scale, rotation, and translation. Panels:
Control image, Input image, Registered Input image. The recovered
parameters are $\theta_0 = 1.2$, scaling $= 1.1567$, $x_0 = 18$, $y_0 = 18$.
Fig. 3. A sequence of images which was mosaicked together in a globally con-
sistent manner utilizing all available cross-linked image pair correspondences.
To obtain good registration, especially along edges, we must compensate for
lens radial distortion. The similarity metric used for point correspondence was
local normalized correlation. The mosaic is rendered as the average of the inten-
sities of overlapping pixels. To preserve mismatches, the results are presented
without blending.
The problem can be viewed as joining two surfaces at their
border, where the gray level values of the image $I(x, y)$
represent the surface's height above the $(x, y)$ plane. The goal
is to perturb each surface near the border so that they join
smoothly without distorting the original surface features too
grossly. Many methods are based upon a weighted sum technique
where the size of the transition zone is an important parameter;
too small a transition zone relative to feature sizes
results in the image border still being visible, albeit blurry,
while too large a transition zone results in a double exposure
effect.
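The role of the transition zone in a weighted sum technique can be seen in a small sketch. The following Python/NumPy example is illustrative only: the linear ramp, the vertical seam, and the names `feather_blend`, `seam`, and `width` are assumptions, not a method taken from the toolbox.

```python
import numpy as np

def feather_blend(left, right, seam, width):
    """Weighted sum of two equally sized arrays across a vertical seam.

    `width` is the size of the transition zone in pixels (must be > 0).
    """
    x = np.arange(left.shape[-1])
    # Weight of `left`: 1 to the left of the zone, 0 to the right, linear inside.
    w = np.clip(0.5 - (x - seam) / width, 0.0, 1.0)
    return w * left + (1.0 - w) * right
```

A small `width` leaves a sharp step at the seam, while a large `width` mixes both images over a wide band, which is the double exposure effect described above.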
We implement a multiresolution pyramidal blending approach
where the images to be blended are first decomposed
into different band-pass frequency components, merged separately
in each frequency band, and then reassembled into a single
seamless composite mosaic [9]. The idea behind this technique
is that the transition zone is optimally matched for feature
sizes within each frequency band of the pyramid.
First, a Gaussian pyramid is constructed for each image
where the base level in the pyramid, $G_0$, is the original image.
Each successive level in the pyramid is a low-pass filtered
version of the previous level, down sampled by a factor of 2, i.e.,
$$G_l[i, j] = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w[m, n]\, G_{l-1}[2i + m, 2j + n] \tag{13}$$
where the $5 \times 5$ generating kernel, $w[m, n]$, is subject to the
following four constraints:
1) For computational convenience, the kernel is separable,
i.e. $w[m, n] = w[m]\, w[n]$.
2) The one-dimensional function $w[\cdot]$ is symmetric.
3) $w[\cdot]$ is normalized, i.e. $\sum_{i=-2}^{2} w[i] = 1$.
4) Finally, each level $l$ node must contribute the same total
weight to level $l + 1$ nodes, resulting in the constraint:
$w[0] + 2\,w[2] = 2\,w[1]$.
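The four constraints leave a single free parameter. A hedged Python/NumPy sketch of one kernel family that satisfies them is given below, following the parameterization of Burt and Adelson [9] with center weight $a$; the default value 0.4 and the function name are assumptions for illustration, not values stated in this paper.

```python
import numpy as np

def generating_kernel(a=0.4):
    """Return the 1-D kernel w[-2..2] and the separable 5x5 kernel w[m, n]."""
    # Symmetric taps: w[0] = a, w[+-1] = 1/4, w[+-2] = 1/4 - a/2.
    w1d = np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])
    # Separability: w[m, n] = w[m] * w[n].
    return w1d, np.outer(w1d, w1d)

w1d, w2d = generating_kernel()
# Array index 2 holds w[0], indices 1 and 3 hold w[+-1], indices 0 and 4 hold w[+-2].
normalized = np.isclose(w1d.sum(), 1.0)
equal_contribution = np.isclose(w1d[2] + 2 * w1d[0], 2 * w1d[1])
```

For any $a$, the taps sum to one and $w[0] + 2\,w[2] = 1/2 = 2\,w[1]$, so constraints 3 and 4 hold by construction.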
Next, the different band-pass components are formed by generating
the Laplacian pyramid. The Laplacian pyramid is generated
from the Gaussian pyramid by expanding the image at the
next higher level in the pyramid to the resolution of the current
level and then subtracting it from the current level:
$$L_l[i, j] = G_l[i, j] - k \sum_{m=-2}^{2} \sum_{n=-2}^{2} G_{l+1}\left[\frac{i + m}{2}, \frac{j + n}{2}\right] \tag{14}$$
This results in each level of the Laplacian pyramid containing
a separate one-octave, band-pass component of the original image.
The two Laplacian pyramids, one for each image, are then
merged at each level of the pyramid. The resulting seamless
mosaic is then constructed by compressing the merged Laplacian
pyramid via
$$I_{\text{merged}} = \sum_{l=0}^{N} L^{\text{merged}}_{l,l} \tag{15}$$
where $N$ is the number of levels in the pyramid and the notation
$L_{l,l}$ implies expansion of the level $L_l$, $l$ times, up to the
resolution of the base level, $L_0$. Figure 4 shows before and after
results of the blending of a two image mosaic.
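The pyramid construction, per-band merge, and reassembly described in this section can be sketched end to end. The following Python/NumPy code is a minimal illustration under assumptions, not the toolbox's MATLAB implementation: the kernel weights, the edge padding, the expansion by zero insertion with a gain of 4 (the expand operation of Burt and Adelson [9]), and the use of a Gaussian pyramid of a mask as the per-band merge weights are all assumed choices. Image sides are assumed divisible by 2 at every level.

```python
import numpy as np

W = np.array([0.05, 0.25, 0.4, 0.25, 0.05])  # assumed 1-D generating kernel

def _blur(img):
    """Separable 5x5 low-pass filter with edge replication at the borders."""
    h, w = img.shape
    p = np.pad(img, 2, mode="edge")
    rows = sum(W[k] * p[:, k:k + w] for k in range(5))
    return sum(W[k] * rows[k:k + h, :] for k in range(5))

def reduce_(img):
    """Low-pass filter, then down sample by a factor of 2, as in (13)."""
    return _blur(img)[::2, ::2]

def expand(img, shape):
    """Up sample to `shape` by zero insertion, filter, and restore the gain."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * _blur(up)

def gaussian_pyramid(img, levels):
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        pyr.append(reduce_(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    """Band-pass levels plus the coarsest Gaussian level as the last entry."""
    g = gaussian_pyramid(img, levels)
    lap = [g[l] - expand(g[l + 1], g[l].shape) for l in range(levels)]
    lap.append(g[-1])
    return lap

def collapse(lap):
    """Expand and add from the coarsest level down to the base level."""
    img = lap[-1]
    for band in reversed(lap[:-1]):
        img = band + expand(img, band.shape)
    return img

def blend(a, b, mask, levels=3):
    """Merge a and b band by band; mask is 1 where a should be used."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    merged = [m * x + (1.0 - m) * y for m, x, y in zip(gm, la, lb)]
    return collapse(merged)
```

Because each Laplacian level stores exactly the residual removed by `expand`, collapsing an unmodified pyramid reproduces the input up to floating point error. Smoothing the mask at each level widens the transition zone for the coarser bands, which is the intent of matching the zone to feature size.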
VI. CONCLUSIONS
This paper has presented results from our efforts to develop
an extended MATLAB image processing and mosaicking toolbox.
Proven algorithms from the land-based literature have
been adapted and applied to the unique challenges of the
underwater environment. The collection of algorithms presented in
this paper groups nicely into a unified framework for underwater
imaging and mosaicking work. Our hierarchical framework of
contrast limited adaptive histogram specification, Fourier based
methods for image registration, local normalized correlation
for a similarity measure, and multiresolution pyramidal based
blending composes a core suite of functionality in a unified
toolbox.
Fig. 4. (a) A two image mosaic with seam. The top image is overlaid over the
bottom image. (b) The final blended result.

REFERENCES
[1] K. Zuiderveld, "Contrast limited adaptive histogram equalization," in
Graphics Gems IV, Paul Heckbert, Ed., vol. IV, pp. 474-485. Academic
Press, Boston, 1994.
[2] J. Lim, Two-Dimensional Signal and Image Processing, Prentice Hall,
Englewood Cliffs, N.J., 1990.
[3] L.G. Brown, "A survey of image registration techniques," ACM Computing
Surveys, vol. 24, no. 4, pp. 325-376, December 1992.
[4] E. De Castro and C. Morandi, "Registration of translated and rotated
images using finite Fourier transforms," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. PAMI-9, no. 5, pp. 700-703,
September 1987.
[5] B.S. Reddy and B.N. Chatterji, "An FFT-based technique for translation,
rotation, and scale-invariant image registration," IEEE Transactions on
Image Processing, vol. 5, no. 8, pp. 1266-1271, August 1996.
[6] O. Pizarro, H. Singh, and S. Lerner, "Towards image-based
characterization of acoustic navigation," in IEEE/RSJ International
Conference on Intelligent Robots and Systems, October 2000, vol. 3,
pp. 1519-1524.
[7] M. Irani and P. Anandan, "Robust multi-sensor image alignment," in Sixth
International Conference on Computer Vision, 1998, January 1996, pp.
959-966.
[8] R. Mandelbaum, G. Salgian, and H. Sawhney, "Correlation-based
estimation of ego-motion and structure from motion and stereo," in
Proceedings of the Seventh IEEE International Conference on Computer
Vision, 1999, Kerkyra, Greece, September 1999, vol. 1, pp. 544-550.
[9] P.J. Burt and E.H. Adelson, "A multiresolution spline with application to
image mosaics," ACM Transactions on Graphics, vol. 2, no. 4, pp. 217-236,
October 1983.