
SIFT: SCALE INVARIANT FEATURE TRANSFORM

ACKNOWLEDGEMENT

The success of any task depends largely on the encouragement and guidance of
many others. I take this opportunity to express my gratitude to the people who have
been instrumental in the successful completion of this technical seminar.

I thank Dr. J Surya Prasad, Director/Principal of PESIT-BSC, not only for providing
excellent facilities but also for offering unending encouragement that has made this
technical seminar a success.

I would like to take this opportunity to thank my HOD and guide, Prof. Chiranjeevi,
for his tremendous support and help. Without his encouragement and guidance this
seminar would not have materialized.

I express my sincere gratitude to the project coordinators, Mrs. Vidya T V, Mr. Shivaraj
J Karki, and Mr. K Rama Murthy, Assistant Professors, Department of Electronics
and Communication, for constantly monitoring the development of the seminar and
setting precise deadlines.


ABSTRACT

This report presents an algorithm for extracting distinctive invariant features from
images that can be used to perform reliable matching between different views of an
object or scene. The features, or keypoints, are invariant to image scale and rotation,
and provide robust matching across a substantial range of affine distortion, change
in 3D viewpoint, addition of noise, and change in illumination.

The keypoints are highly distinctive, in the sense that a single feature can be correctly
matched with high probability against a large database of keypoints from many
images. The report also describes an approach to using these features for
object recognition.

Categories and Subject Descriptors: Image Processing

Keywords: Distinctive Invariant Features, Affine Distortion


TABLE OF CONTENTS

1 Introduction

2 Related Research

3 Detection of Scale Space

3.1 Local Extrema Detection

3.2 Sampling Frequency in Scale

3.3 Sampling in Spatial Domain

4 Accurate Keypoint Localization

4.1 Initial Outlier Rejection

4.2 Further Outlier Rejection

5 Orientation Assignment

6 Local Image Descriptor

6.1 Extraction of Local Image Descriptor at Keypoints

6.2 Descriptor Testing

7 Conclusion


LIST OF FIGURES

3.1 Difference of Gaussian

3.2 Detection of Maxima and Minima of DoG

3.3 Scales per Octave

3.4 Number of Scales for Calculating DoG

3.5 Sampling in Spatial Domain

4.1 Keypoint Localization

5.1 Orientation Histogram

6.1 Keypoint Descriptor

6.2 Optimum Width for Descriptor


CHAPTER 1

INTRODUCTION

Image feature matching is a fundamental component of many problems in computer
vision, including object or scene recognition, solving for 3D structure from multiple
images, stereo correspondence, and motion tracking.

This report describes image keypoints that have many properties making them
suitable for matching differing images of an object or scene. The keypoints are
invariant to image scaling and rotation, and partially invariant to change in
illumination and 3D camera viewpoint. An important characteristic of these
features is that the relative positions between them in the original scene should not
change from one image to another.

They are well localized in both the spatial and frequency domains, reducing the
probability of disruption by noise. Large numbers of features can be extracted from
typical images with efficient algorithms. In addition, the keypoints are distinctive,
which allows a single feature to be correctly matched against a large database of
features, providing a basis for object recognition.

Scale-space detection: The first stage searches over all scales and image locations.
It is implemented efficiently by using a difference-of-Gaussian function to identify
candidate keypoints that are invariant to scale and orientation.

Keypoint localization: At each candidate location, a detailed model is fit to the
nearby data to determine accurate location, scale, and ratio of principal curvatures.
This step removes keypoints with low contrast and helps select keypoints that are
stable.

Orientation assignment: In this step one or more orientations are assigned to
each keypoint location based on local image gradient directions. All subsequent
operations are performed on image data that has been transformed relative to the
assigned orientation, scale, and location of each feature, thereby providing invariance
to these transformations.

Keypoint descriptor: The local image gradients are measured at the selected
scale in the region around each keypoint. These are transformed into a representation
that allows for significant levels of local shape distortion and change in illumination.

This approach has been named the Scale Invariant Feature Transform (SIFT),
as it transforms image data into scale-invariant coordinates relative to local features.


CHAPTER 2

RELATED RESEARCH

The development of image matching using a set of local interest points can be
traced back to the work of Moravec (1981) on stereo matching with a corner detector.
The detector was improved by Harris and Stephens (1988) to make it more repeatable
under small image variations. Harris also showed its value for efficient motion tracking
and 3D structure from motion recovery (Harris, 1992), and the Harris corner detector
has since been widely used for image matching. Although these detectors are usually
called corner detectors, they do not select only corners, but rather any image location
that has large gradients in all directions at a predetermined scale.

The initial applications were restricted to stereo and motion tracking, but the
approach was later extended further. Zhang et al. (1995) showed that it is possible to
match Harris corners over a large image range by using a correlation window around
each corner to select likely matches. Outliers were then removed by solving for a
fundamental matrix describing the geometric constraints between the two views of a
rigid scene and removing matches that did not agree with the majority solution.

There has been impressive work on extending local features to be invariant to full affine
transformations (Baumberg, 2000; Tuytelaars and Van Gool, 2000; Mikolajczyk and
Schmid, 2002; Schaffalitzky and Zisserman, 2002; Brown and Lowe, 2002). This
allows matching of invariant features on a planar surface under changes in
orthographic 3D projection, in most cases by resampling the image. However, none
of these approaches is yet fully affine invariant, as they start with initial feature
scales and locations selected in a non-affine-invariant manner due to the prohibitive
cost of exploring the full affine space.


CHAPTER 3

DETECTION OF SCALE SPACE

As discussed in the introduction, keypoints are detected using a cascade filtering
approach, in which efficient algorithms identify candidate locations that are then
examined in more detail. The first stage of keypoint detection is to identify
locations and scales that can be repeatably assigned under differing views of the
same object. Detecting locations that are invariant to scale change of the image can
be accomplished by searching for stable features across all possible scales, using a
continuous function of scale known as scale space.

Under a variety of reasonable assumptions, the only possible scale-space kernel is the
Gaussian function. Therefore, the scale space of an image is defined as a function,
L(x, y, σ), produced from the convolution of a variable-scale Gaussian, G(x, y, σ),
with an input image, I(x, y):

L(x, y, σ) = G(x, y, σ) ∗ I(x, y)

where * is the convolution operation in x and y, and

G(x, y, σ) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²))    (1)

To efficiently detect stable keypoint locations in scale space, we use scale-space
extrema in the difference-of-Gaussian (DoG) function convolved with the image,
D(x, y, σ), which can be computed from the difference of two nearby scales separated
by a constant multiplicative factor k:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y) = L(x, y, kσ) − L(x, y, σ)

There are several reasons for choosing this function. First, it is particularly efficient
to compute: the smoothed images, L, must be computed in any case for scale-space
feature description, so D can be obtained by simple image subtraction.

Fig 3.1 Difference of Gaussian

The difference of Gaussian also provides a close approximation to the scale-normalized
Laplacian of Gaussian (LoG), σ²∇²G. The normalization of the Laplacian with the
factor σ² is required for true scale invariance. Experimentally, the most stable image
keypoints are found at the maxima and minima of σ²∇²G, compared with a range of
other possible image functions, such as the gradient, Hessian, or Harris corner function.

The relationship between the difference of Gaussian and the LoG can be understood
from the heat diffusion equation (parameterized in terms of σ rather than the more
usual t = σ²):

∂G/∂σ = σ∇²G

From this, σ∇²G can be computed from the finite difference approximation to ∂G/∂σ,
using the difference of nearby scales at kσ and σ:

σ∇²G = ∂G/∂σ ≈ (G(x, y, kσ) − G(x, y, σ))/(kσ − σ)

and therefore,

G(x, y, kσ) − G(x, y, σ) ≈ (k − 1)σ²∇²G


This shows that when the difference-of-Gaussian function has scales differing by a
constant factor, it already incorporates the σ² scale normalization required for scale
invariance. The factor (k − 1) in the equation is constant over all scales, so it does
not influence extrema locations. The approximation error goes to zero as k approaches
1, but in practice the approximation has almost no impact on the stability of extrema
detection or localization even for significant differences in scale, such as k = √2.

An approach to the construction of D(x, y, σ) is shown in Fig 3.1. The input image
is incrementally convolved with Gaussians to produce images separated by a constant
factor k in scale space, shown stacked in the left column of Fig 3.1. Each octave of
scale space (i.e., each doubling of σ) is divided into an integer number, s, of intervals,
so k = 2^(1/s). We must produce s + 3 images in the stack of blurred images for
each octave, so that extrema detection covers a complete octave.

Once a complete octave has been processed, the Gaussian image that has twice the
initial value of σ is resampled by taking every second pixel in each row and column,
halving its resolution. The accuracy of sampling relative to σ is no different than for
the previous octave, while the computation is greatly reduced.
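
To make the octave construction concrete, the following Python sketch (our own
illustration, using OpenCV and NumPy; the function name and parameter defaults
are assumptions, not from the original paper) builds s + 3 Gaussian images per
octave, takes their pairwise differences to form the DoG stack, and downsamples the
image with twice the initial σ to start the next octave. For simplicity it blurs each
scale directly from the octave base rather than incrementally.

import cv2
import numpy as np

def build_dog_pyramid(image, num_octaves=4, s=3, sigma=1.6):
    # s intervals per octave require s + 3 blurred images so that
    # extrema detection covers a complete octave (see text above).
    k = 2.0 ** (1.0 / s)
    pyramid = []
    base = image.astype(np.float32)
    for _ in range(num_octaves):
        gaussians = [cv2.GaussianBlur(base, (0, 0), sigma * k ** i)
                     for i in range(s + 3)]
        # D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)
        dogs = [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]
        pyramid.append(dogs)
        # The image at index s has twice the initial sigma; take every
        # second pixel (half resolution) to start the next octave.
        h, w = gaussians[s].shape[:2]
        base = cv2.resize(gaussians[s], (w // 2, h // 2),
                          interpolation=cv2.INTER_NEAREST)
    return pyramid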


3.1 Local Extrema Detection

To detect the local maxima and minima of D(x, y, σ), each sample point is
compared to its eight neighbors in the current image and its nine neighbors in the
scale above and below, as shown in fig 3.2. The sample point is selected only if it is
larger than all of these neighbors or smaller than all of them. The cost of this check
is reasonably low because most sample points are eliminated within the first few
comparisons.

Fig 3.2 Detection of maxima and minima of DoG

An important issue is determining the frequency with which the image and scale
domains must be sampled to reliably detect the extrema. Unfortunately, there is no
minimum spacing of samples that will detect all extrema, as extrema can be
arbitrarily close together. This can be seen by considering a white circle on a black
background, which will have a single scale-space maximum where the circular
positive central region of the difference-of-Gaussian function matches the size and
location of the circle.
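
A minimal sketch of this comparison (our own helper; the three arguments are the
DoG images below, at, and above the current scale, as NumPy arrays, and (r, c)
must be at least one pixel from the image border):

import numpy as np

def is_scale_space_extremum(dog_below, dog_cur, dog_above, r, c):
    # Gather the 3x3x3 cube centered on (r, c) across the three scales.
    cube = np.stack([d[r - 1:r + 2, c - 1:c + 2]
                     for d in (dog_below, dog_cur, dog_above)])
    val = dog_cur[r, c]
    # val is itself one of the 27 values, so equality with the cube's
    # max (or min) means it is no smaller (larger) than all 26 neighbors.
    return val == cube.max() or val == cube.min()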


3.2 Sampling Frequency in Scale

The stability of extrema under different sampling frequencies in scale can be seen in
figs 3.3 and 3.4. These figures are based on a matching task using a collection of 32
images covering a diverse range of subjects, including outdoor scenes, human faces,
aerial photographs, and industrial images with differing contrast. Each image was
then subjected to a range of transformations, including rotation, scaling, affine
stretch, change in brightness and contrast, and addition of noise. Because the
transformations were synthetic, it was possible to predict where each feature in the
original image should appear in the transformed image, allowing for measurement of
correct repeatability and positional accuracy for each feature.

Fig 3.3 Scales per Octave

Fig 3.3 shows the simulation results used to examine the effect of varying the number
of scales per octave at which the image function is sampled prior to extrema
detection. In this case, each image was resampled following rotation by a random
angle and scaling by a random amount between 0.2 and 0.9 times the original size.
Keypoints from the reduced-resolution image were matched against those from the
original image.

The top line in fig 3.3 shows the percentage of keypoints that are detected at a
matching location and scale in the transformed image. Here, we define a matching
scale as being within a factor of √2 of the correct scale, and a matching location as
being within σ pixels, where σ is the scale of the keypoint. The lower line on this
graph shows the number of keypoints that are correctly matched to a database of
40,000 keypoints using nearest-neighbor matching; this shows that once a keypoint
is repeatably located, it is useful for recognition and matching tasks. As fig 3.3
shows, the highest repeatability is obtained when sampling 3 scales per octave, and
this is the number of scales per octave used for feature extraction.

Fig 3.4 Number of scales for calculating DOG

It may seem surprising that repeatability does not continue to improve as the number
of scales per octave increases. The reason is that many more local extrema are
detected, and these additional extrema are on average less stable and have lower
contrast, as shown in fig 3.4, which plots the average number of keypoints correctly
detected in each image.

The number of keypoints increases with increased sampling of scales, and the total
number of correct matches also rises. Since the success of object recognition often
depends more on the quantity of correctly matched keypoints than on their
percentage of correct matches, for many applications it will be optimal to use a larger
number of scale samples. However, the cost of computation also rises with this
number, so we instead choose to use just 3 scale samples per octave.

To summarize, these experiments show that the scale-space difference-of-Gaussian
function has a large number of extrema and that it would be expensive to detect
them all. Fortunately, we can detect the most stable and useful subset even with a
coarse sampling of scales.


3.3 Sampling in Spatial Domain

Having determined the sampling frequency per octave of scale space in the previous
section, we must now determine the frequency of sampling in the image domain
relative to the scale of smoothing, σ. As extrema can be arbitrarily close together,
there is a similar trade-off between sampling frequency and rate of detection.
Fig 3.5 shows an experimental determination of the amount of prior smoothing, σ,
that is applied to each image before building the scale space for an octave. The top
line is the repeatability of keypoint detection, and the results show that the
repeatability continues to increase with σ. However, the cost of computation also
increases with σ, so we choose σ = 1.6, which provides close to optimal repeatability.
This value is used for calculating keypoints, and the results are shown in fig 3.5.

Fig 3.5 Sampling in Spatial domain


CHAPTER 4

ACCURATE KEYPOINT LOCALIZATION

4.1 Initial Outlier Rejection

Once a keypoint candidate has been found by comparing a pixel to its neighbors,
the next step is to perform a detailed fit to the nearby data for location, scale, and
ratio of principal curvatures. This allows points to be rejected that have low
contrast (and are therefore sensitive to noise) or are poorly localized along an edge.

Initially, keypoints were simply located at the position and scale of the central sample
point. However, Brown developed a method for fitting a 3D quadratic function to
the local sample points to determine the interpolated location of the extremum, and
this turns out to improve matching and stability. The approach uses the Taylor
expansion (up to the quadratic terms) of the scale-space function, D(x, y, σ),
shifted so that the origin is at the sample point:

D(x) = D + (∂D/∂x)ᵀ x + ½ xᵀ (∂²D/∂x²) x    (1)

where D and its derivatives are evaluated at the sample point and x = (x, y, σ)ᵀ is
the offset from this point. The location of the extremum, x̂, is determined by taking
the derivative of this function with respect to x and setting it to zero, giving

x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x)    (2)

The derivatives of D are approximated by using differences of neighboring sample
points. The resulting 3x3 linear system can be solved with minimal cost. If the
offset x̂ is larger than 0.5 in any dimension, then the extremum lies closer to a
different sample point. In this case, the sample point is changed and the
interpolation is performed about that point instead. The final offset x̂ is added to
the location of its sample point to get the interpolated estimate for the location of
the extremum.

Fig 4.1 Keypoint Localization

The function value at the extremum, D(x̂), is useful for rejecting unstable keypoints
with low contrast. It can be obtained by substituting equation (2) into equation (1),
giving

D(x̂) = D + ½ (∂D/∂x)ᵀ x̂

To remove low-contrast keypoints, all extrema with a value of |D(x̂)| less than 0.03
were discarded (assuming image pixel values in the range [0, 1]). Fig 4.1 shows the
effects of keypoint selection on a natural image. In order to avoid too much clutter,
a low-resolution 233 by 189 pixel image is used, and keypoints are shown as vectors
giving the location, scale, and orientation of each keypoint (orientation assignment
is described below).

Fig 4.1(a) shows the original image, which is displayed at reduced contrast behind
the subsequent panels. Fig 4.1(b) shows the 832 keypoints at all detected maxima
and minima of the difference-of-Gaussian function, while Fig 4.1(c) shows the 729
keypoints that remain following removal of those with |D(x̂)| less than 0.03. Fig
4.1(d) will be explained in the next section.
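
The fit can be implemented directly with finite differences on the DoG stack. The
sketch below (our own code; dog is assumed to be a NumPy array indexed as
(scale, row, column)) forms the gradient and 3x3 Hessian of D at a sample point,
solves equation (2) for the offset x̂, and evaluates the contrast D(x̂) used for the
0.03 threshold. If any component of the offset exceeds 0.5, the fit would be
repeated about the neighboring sample point, as described above.

import numpy as np

def refine_extremum(dog, s, r, c):
    # Gradient of D by central differences, in (x, y, scale) order.
    g = 0.5 * np.array([dog[s, r, c + 1] - dog[s, r, c - 1],
                        dog[s, r + 1, c] - dog[s, r - 1, c],
                        dog[s + 1, r, c] - dog[s - 1, r, c]])
    v = dog[s, r, c]
    # Hessian of D by finite differences.
    dxx = dog[s, r, c + 1] - 2 * v + dog[s, r, c - 1]
    dyy = dog[s, r + 1, c] - 2 * v + dog[s, r - 1, c]
    dss = dog[s + 1, r, c] - 2 * v + dog[s - 1, r, c]
    dxy = 0.25 * (dog[s, r + 1, c + 1] - dog[s, r + 1, c - 1]
                  - dog[s, r - 1, c + 1] + dog[s, r - 1, c - 1])
    dxs = 0.25 * (dog[s + 1, r, c + 1] - dog[s + 1, r, c - 1]
                  - dog[s - 1, r, c + 1] + dog[s - 1, r, c - 1])
    dys = 0.25 * (dog[s + 1, r + 1, c] - dog[s + 1, r - 1, c]
                  - dog[s - 1, r + 1, c] + dog[s - 1, r - 1, c])
    H = np.array([[dxx, dxy, dxs],
                  [dxy, dyy, dys],
                  [dxs, dys, dss]])
    offset = -np.linalg.solve(H, g)      # x_hat from equation (2)
    contrast = v + 0.5 * g.dot(offset)   # D(x_hat); reject if |.| < 0.03
    return offset, contrast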


4.2 Further Outlier Rejection

Rejecting low-contrast extrema, as in the previous section, is not sufficient for
stability. The difference-of-Gaussian function has a strong response along edges
even when the location along the edge is poorly determined, making such keypoints
unstable to small amounts of noise.

A poorly defined peak in the DoG function will have a large principal curvature
across the edge but a small one in the perpendicular direction. The principal
curvatures can be computed from a 2x2 Hessian matrix, H, computed at the
location and scale of the keypoint:

H = [ Dxx  Dxy ]
    [ Dxy  Dyy ]

The derivatives are estimated by taking differences of neighbouring sample points.

The eigenvalues of H are proportional to the principal curvatures of D. We can
avoid explicitly computing the eigenvalues, as we are only concerned with their
ratio. Let α be the eigenvalue with the largest magnitude and β be the smaller one.
Then we can compute the sum of the eigenvalues from the trace of H and their
product from the determinant:

Tr(H) = Dxx + Dyy = α + β

Det(H) = Dxx Dyy − (Dxy)² = αβ

If the determinant is negative, the curvatures have different signs, so the point is
discarded as not being an extremum. Let r be the ratio between the largest-magnitude
eigenvalue and the smaller one, so that α = rβ. Then,


Tr(H)²/Det(H) = (α + β)²/(αβ) = (rβ + β)²/(rβ²) = (r + 1)²/r

This quantity depends only on the ratio of the eigenvalues rather than on their
individual values. The quantity (r + 1)²/r is at a minimum when the two
eigenvalues α and β are equal, and it increases with r. Therefore, to check that the
ratio of principal curvatures is below some threshold, r, we only need to check whether

Tr(H)²/Det(H) < (r + 1)²/r

This check is very efficient to compute, requiring fewer than 20 floating-point
operations per keypoint. We eliminate keypoints that have a ratio between the
principal curvatures greater than 10 (i.e., r = 10). The transition from Fig 4.1(c) to
Fig 4.1(d) shows the effect of this operation.
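
As a sketch (our own helper), the test reduces to a few arithmetic operations on the
entries of H; the comparison is rearranged to avoid a division:

def passes_edge_test(dxx, dyy, dxy, r=10.0):
    # Trace and determinant of the 2x2 Hessian H.
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # curvatures have different signs: reject
    # Equivalent to Tr(H)^2 / Det(H) < (r + 1)^2 / r, since det > 0.
    return tr * tr * r < (r + 1.0) ** 2 * det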


CHAPTER 5

ORIENTATION ASSIGNMENT

An orientation is assigned to each keypoint in order to achieve invariance to image
rotation. The scale of the keypoint is used to select the Gaussian-smoothed image, L,
with the closest scale, so that all computations are performed in a scale-invariant
manner. For each image sample, L(x, y), at this scale, the gradient magnitude,
m(x, y), and orientation, θ(x, y), are precomputed using pixel differences:

m(x, y) = √((L(x + 1, y) − L(x − 1, y))² + (L(x, y + 1) − L(x, y − 1))²)

θ(x, y) = tan⁻¹((L(x, y + 1) − L(x, y − 1))/(L(x + 1, y) − L(x − 1, y)))

An orientation histogram is then formed from the gradient orientations of sample
points within a region around the keypoint. The histogram has 36 bins covering the
360-degree range of orientations. Each sample added to the histogram is weighted
by its gradient magnitude and by a Gaussian-weighted circular window with a σ
that is 1.5 times the scale of the keypoint.

Fig 5.1 Orientation Histogram

Peaks in the orientation histogram correspond to dominant directions of the local
gradients. The highest peak in the histogram is detected, and any other local peak
that is within 80% of the highest peak is also used to create a keypoint with that
orientation. Therefore, locations with multiple peaks of similar magnitude will have
multiple keypoints created at the same location and scale but with different
orientations. Only about 15% of points are assigned multiple orientations, but these
contribute significantly to the stability of matching. Finally, a parabola is fit to the
3 histogram values closest to each peak to interpolate the peak position for better
accuracy.
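
A simplified sketch of the procedure (our own code; L is the Gaussian-smoothed
image as a NumPy array, the window radius of 3σ is our choice, and the parabolic
peak interpolation is omitted for brevity):

import numpy as np

def keypoint_orientations(L, x, y, scale, bins=36, peak_ratio=0.8):
    sigma = 1.5 * scale                  # weighting window from the text
    radius = int(round(3 * sigma))
    hist = np.zeros(bins)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = y + dy, x + dx
            if not (0 < r < L.shape[0] - 1 and 0 < c < L.shape[1] - 1):
                continue
            gx = L[r, c + 1] - L[r, c - 1]   # pixel differences
            gy = L[r + 1, c] - L[r - 1, c]
            mag = np.hypot(gx, gy)
            theta = np.degrees(np.arctan2(gy, gx)) % 360.0
            w = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            hist[int(theta * bins / 360.0) % bins] += w * mag
    # Keep every local peak within 80% of the highest peak.
    return [i * 360.0 / bins for i in range(bins)
            if hist[i] >= peak_ratio * hist.max()
            and hist[i] > hist[i - 1]
            and hist[i] > hist[(i + 1) % bins]]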


CHAPTER 6

LOCAL IMAGE DESCRIPTOR

The previous operations have assigned an image location, scale, and orientation to
each keypoint. These parameters impose a repeatable local 2D coordinate system in
which to describe the local image region, and therefore provide invariance to these
parameters. The next step is to compute a descriptor for the local image region
that is highly distinctive yet as invariant as possible to remaining variations, such
as change in illumination or 3D viewpoint.

One obvious approach would be to sample the local image intensities around the
keypoint at the appropriate scale and to match these using a normalized correlation
measure. However, simple correlation of image patches is highly sensitive to changes
that cause misregistration of samples, such as affine or 3D viewpoint change or
non-rigid deformations.


6.1 Extraction of Local Image Descriptor at Keypoints

Fig 6.1 illustrates the computation of the keypoint descriptor. First the image
gradient magnitudes and orientations are sampled around the keypoint location,
using the scale of the keypoint to select the level of Gaussian blur for the image. In
order to achieve orientation invariance, the coordinates of the descriptor and the
gradient orientations are rotated relative to the keypoint orientation. For efficiency,
the gradients are precomputed for all levels of the pyramid, as described in Chapter 5.
They are illustrated with small arrows at each sample location on the left side of
Fig 6.1.

Fig 6.1 Keypoint Descriptor

A Gaussian weighting function with σ equal to one half the width of the descriptor
window is used to assign a weight to the magnitude of each sample point. This is
illustrated with a circular window on the left side of Fig 6.1, although the weight
actually falls off smoothly. The purpose of this Gaussian window is to avoid sudden
changes in the descriptor with small changes in the position of the window, and to
give less emphasis to gradients that are far from the center of the descriptor, as
these are the most affected by misregistration errors.


The keypoint descriptor is shown on the right side of Fig 6.1. It allows for
significant shift in gradient positions by creating orientation histograms over 4x4
sample regions. The figure shows eight directions for each orientation histogram,
with the length of each arrow corresponding to the magnitude of that histogram
entry. A gradient sample on the left can shift up to 4 sample positions while still
contributing to the same histogram on the right, thereby achieving the objective of
allowing for larger local positional shifts.

The descriptor is formed from a vector containing the values of all the orientation
histogram entries, corresponding to the lengths of the arrows on the right side of
Fig 6.1. The figure shows a 2x2 array of orientation histograms, whereas the
experiments described below show that the best results are achieved with a 4x4
array of histograms with 8 orientation bins in each. Therefore, the experiments here
use a 4x4x8 = 128-element feature vector for each keypoint.
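
The accumulation step can be sketched as follows (our own simplification; mag and
theta are assumed to be 16x16 arrays of gradient magnitudes and orientations,
already rotated relative to the keypoint orientation and weighted by the Gaussian
window). The real descriptor distributes each sample into adjacent histograms and
bins with trilinear interpolation; this sketch uses hard binning instead.

import numpy as np

def sift_descriptor(mag, theta, n=4, bins=8):
    cell = mag.shape[0] // n             # samples per cell side (4)
    desc = np.zeros((n, n, bins))
    for i in range(mag.shape[0]):
        for j in range(mag.shape[1]):
            # Assign the sample to one of 8 orientation bins.
            b = int((theta[i, j] % 360.0) / (360.0 / bins)) % bins
            desc[i // cell, j // cell, b] += mag[i, j]
    return desc.ravel()                  # 4 x 4 x 8 = 128 values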

6.1.1 Removing Illumination Effects


To reduce the effects of illumination change, the feature vector is normalized to unit
length. A change in image contrast, in which each pixel value is multiplied by a
constant, will multiply the gradients by the same constant, so the contrast change
is canceled by vector normalization. A brightness change, in which a constant is
added to each image pixel, will not affect the gradient values, as they are computed
from pixel differences. Therefore, the descriptor is invariant to affine changes in
illumination.

However, non-linear illumination changes can also occur due to camera saturation or
due to illumination changes that affect 3D surfaces with differing orientations by
different amounts. These effects can cause a large change in relative magnitudes for
some gradients, but are less likely to affect the gradient orientations. Therefore, we
reduce the influence of large gradient magnitudes by thresholding the values in the
unit feature vector to each be no larger than 0.2, and then renormalizing to unit
length. This means that matching the magnitudes of large gradients is no longer as
important, and the distribution of orientations carries greater emphasis.
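
Both normalization steps amount to a few lines (a sketch; the small epsilon guarding
against division by zero is our addition):

import numpy as np

def normalize_descriptor(vec, clip=0.2, eps=1e-12):
    vec = vec / (np.linalg.norm(vec) + eps)   # cancels contrast change
    vec = np.minimum(vec, clip)               # damp large gradient magnitudes
    return vec / (np.linalg.norm(vec) + eps)  # renormalize to unit length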


6.2 Descriptor Testing


Two parameters can be used to vary the complexity of the descriptor: the number of
orientations, r, in the histograms, and the width, n, of the n x n array of orientation
histograms. The size of the resulting descriptor vector is rn². As the complexity of
the descriptor increases, it is able to discriminate better in a large database, but it
also becomes more sensitive to shape distortions and occlusion.

Fig 6.2 Optimum width for Descriptor

Fig 6.2 shows experimental results in which the number of orientations and the size
of the descriptor were varied. The graph was generated for a viewpoint
transformation in which a planar surface is tilted by 50 degrees away from the
viewer, with 4% image noise added.

The results continue to improve up to a 4x4 array of histograms with 8 orientations.
After that, adding more orientations or a larger descriptor can actually hurt
matching by making the descriptor more sensitive to distortion.


CHAPTER 7

CONCLUSION

SIFT keypoints are particularly valuable due to their distinctiveness, which enables
a keypoint to find its correct match in a large database of other keypoints. This
distinctiveness is achieved by representing the image gradients within a local region
of the image as a high-dimensional vector. The keypoints have been shown to be
invariant to image rotation and scale, and robust to affine distortion, addition of
noise, and change in illumination. Large numbers of keypoints can be extracted from
typical images, which leads to robustness in extracting small objects among clutter.
The fact that keypoints are detected over a complete range of scales means that small
local features are available for matching small and highly occluded objects, while
large keypoints perform well for images subject to noise and blur. Finally, the
computation is efficient, as several thousand keypoints can be extracted from a
typical image with near real-time performance on standard PC hardware.


REFERENCES

[1] Witkin, A.P. 1983. Scale-space filtering. In International Joint Conference on
Artificial Intelligence, Karlsruhe, Germany, pp. 1019-1022.
[2] Mikolajczyk, K. and Schmid, C. 2002. An affine invariant interest point
detector. In European Conference on Computer Vision (ECCV), Copenhagen,
Denmark, pp. 128-142.
[3] Lindeberg, T. 1993. Detecting salient blob-like image structures and their scales
with a scale-space primal sketch: a method for focus-of-attention. International
Journal of Computer Vision, 11(3): 283-318.
[4] Lowe, D.G. 2001. Local feature view clustering for 3D object recognition. IEEE
Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, pp.
682-688.
[5] Harris, C. and Stephens, M. 1988. A combined corner and edge detector. In
Fourth Alvey Vision Conference, Manchester, UK, pp. 147-151.
[6] Brown, M. and Lowe, D.G. 2002. Invariant features from interest point groups.
In British Machine Vision Conference, Cardiff, Wales, pp. 656-665.
[7] https://www.youtube.com/watch?v=NPcMS49V5hg
