0% found this document useful (0 votes)

30 views

Classical Computer Vision - Session 2

Uploaded by

tahatarek7770

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

30 views

Classical Computer Vision - Session 2

Uploaded by

tahatarek7770

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 74

TEXTURE FEATURES

Texture features are repeated patterns of local variation in image intensity. It gives us
information about the spatial arrangement of color or intensities in an image. In other
words texture analysis attempts to quantify intuitive qualities described by terms such
as rough, smooth, silky, or bumpy as a function of the spatial variation in pixel
intensities. This will be done using binary codes as we will take on the next slides.
Note: Texture feature can’t be deﬁned for a point (pixel).
Images can have same intensity distribution with different textures as shown below:
TEXTURE FEATURES
Applications of texture analysis
● Segment image into regions with same texture. (Image segmentation)
● Object recognition based on their textures.
● Edge detection can be done by the change in the texture (The parts where the
texture changes are mostly edges.

Texture basic element is called texels. The texture generally is found in 2 parts:
● Tone which is the pixel intensity
● Structure which the spatial relationship between the texels.
LOCAL BINARY PATTERNS (LBP)
Local binary patterns (LBPs) are texture descriptors made in 2002 that work locally on
parts of images. This local representation is constructed by comparing each pixel with
its surrounding neighborhood of pixels.
Steps of constructing LBP descriptor
● Convert image to grayscale images.
● Loop over each pixel such that we compare it with its 8 neighbours (28 possibility).
○ If neighbour has less value then put 1.
○ If neighbour has higher value then put 0.
LOCAL BINARY PATTERNS (LBP)
● Convert the LPB code you got to a decimal value in counter clockwise manner.

● Change value of output image at this pixel location to 23 and keep looping on
other pixels and do the same previous steps.
LOCAL BINARY PATTERNS (LBP)
The output will be something like this which is not very intuitive to us but make sense
for computer vision tasks like segmentation..
LOCAL BINARY PATTERNS (LBP)
The problem of this original LBP descriptor is that it can’t sense the details at varying
scales (it can only sense the variability in 3x3 neighbourhood), two parameters will be
introduced to account for variable neighbourhood sizes:
● Number of points (P) in circularly neighbourhood to consider.
● Radius (R) which gives us the availability to deal with different scales.

How we get the values of g1, g3, g5, and g7 points at figure (a) ? Bilinear interpolation !!
LOCAL BINARY PATTERNS (LBP)
Bilinear interpolation is a popular method for two-dimensional interpolation on a
rectangle. That is, we assume that we know the values of some unknown function at
four points that form a rectangle. Using bilinear interpolation, we can estimate this
function's value at any point (x, y) inside this rectangle. We will denote this unknown
value by P.
INTEREST POINTS
Look at this red box and ask yourself, is this part can be assumed as interest point? Is it
distinguishable and seems unique?
INTEREST POINTS (FOR IMAGE MATCHING)
INTEREST POINTS
To find points that can tell me that this image is same as this one, we need to find
features that are unique and distinguishable where we can decide whether the 2 images
are the same or not based on them. These are the interest points which are local
features (subparts of image) which are invariant to transformation as shown below:
INTEREST POINTS
One of the most used interest points that gives uniqueness and distinguishability are
corners. WHY?
INTEREST POINTS
What makes corner a great interest point that we can match or distinguish based on it?
INTEREST POINTS
CORNER DETECTION (HARRIS)
How is the discovery of corners done mathematically?

(u, v) is the shifting in x & y directions.

The equation is actually measuring the sum of squared differences between the
intensity of the pixels after shifting and before shifting where w(x, y) is a rectangular
(gives 0s weights to all pixels outside the window and 1s to all pixels in the window we
test) or Gaussian (Give higher weights to the center of the window and lower weights
towards the window sides)

In your opinion, what output of E(u, v) makes it corner or not?

CORNER DETECTION (HARRIS)
Since I(x+u, y+v) - I(x, y) is the measurement of quantity of variation between the original
patch(window) and after shifting the window hence as the value of E(u, v) increases, this
means more variation which means more cornerness. If E(u, v) is 0 this means that the
shift of the window doesn’t affect the values intensities which means it is not a corner
but a ﬂat surface or moving along an edge as we said before.
CORNER DETECTION (HARRIS)
Using Taylor’s series of expansion on I(x+u, y+v), we reach the following:

If the shift (u, v) is small, then we can approximate it to: (remove higher order terms)

Note: Ix means the differentiation of I in the x-direction which is actually the vertical
edges. It can be computed using sobel ﬁlter easily.
CORNER DETECTION (HARRIS)
This leads to the following equation of E(u, v):
CORNER DETECTION (HARRIS)
CORNER DETECTION (HARRIS)
Finally we can reach our ﬁnal equation:

Now what we want is to know which directions makes the largest or smallest values of
E(u, v)? Eigenvalues can do this for us. λ1 and λ2 will be the 2 eigenvalues of M.
● If λ1 and λ2 are small, hence the E(u, v) will be small ⇒ Flat surface.
● If λ1 >> λ2 or vice versa hence change is in one direction ⇒ Edge.
● If λ1 and λ2 are large and close to each other (λ1 ~ λ2) which means E(u, v) is high in
all direction ⇒ Corner
CORNER DETECTION (HARRIS)
CORNER DETECTION (HARRIS)
Instead of computing the Eigenvalues explicitly, we can use the following equation to
get corner strength (R):

Notes:
● trace(matrix) =
● k value is empirically between 0.04 and 0.06.
● det(M) = λ1 . λ2
● trace(M) = λ1 + λ2
How to say whether it is a corner, edge or flat?
● If R is above threshold hence corner (interest point)
● If R is negative hence edge (contour)
● If R is small around 0 hence flat (uniform)
CORNER DETECTION (HARRIS)
CORNER DETECTION (HARRIS)
R-Value where red means high and blue means low
CORNER DETECTION (HARRIS)
Thresholding if R > threshold value
CORNER DETECTION (HARRIS)
CORNER DETECTION (HARRIS)
Now we can say that Harris detector can get the interest points on an image where we
can use them to define pixels or parts that are unique enough to identify an image.

The output of Harris detector is rotation invariance so R is invariant to image rotation as

the ellipse rotates but its shape (eigenvalues) remains the same.
CORNER DETECTION (HARRIS)
Although Harris detector is rotation invariant however it is not scale invariant. TO make
it scale invariant, we need to discuss Blob detection but it is not included in this course.
HAAR CASCADES

HAAR Cascades is an algorithm for object detection regardless of objects’ location or

scale in image. it can run in real-time which make it possible to run on video streams.
The original paper focused its research on faces objects however it can be run on any
type of objects needed to be detected.
The algorithm can summarized in four stages:
● Calculate Haar features
● Create integral images
● Apply AdaBoost
● Implement attention cascade classiﬁers

Let us now dive deeper to understand each stage assuming that we will be detecting
faces as the original images in our example.
HAAR CASCADES

Step1: Calculate Haar features

Before going in calculations, you need to know that Haar cascade is like any machine
learning algorithm needs positive and negative samples of the object you train on. It is
a machine learning-based algorithm as it adopts the idea of training as well.
Haar features are calculations that are performed on adjacent rectangular regions at
speciﬁc location in the detection window.
Haar Features are similar to the kernels that we have gone through before and what we
will go through in CNNs (the difference with the CNNs is that it is not trainable but
manually determined).

In Haar cascades, we have about 180,000 Haar features used to ﬁnd the suitable
features representing the objects we search for. But ﬁrstly what is Haar features look
like?
HAAR CASCADES

Haar features are broadly classified into three categories. The first set of two rectangle
features are responsible for finding out the edges in a horizontal or in a vertical
direction (as shown above). The second set of three rectangle features are responsible
for finding out if there is a lighter region surrounded by darker regions on either side or
vice-versa. The third set of four rectangle features are responsible for finding out change
of pixel intensities across diagonals. These are scaled at different scales and aspect
ratios to get our 180,000 Haar features.
HAAR CASCADES
HAAR CASCADES

These Haar features are mainly applied in iterative manner as sliding window on the
image:
HAAR CASCADES

What actually happens is that When haar features are applied to image as shown in the
previous slide, each feature results in a single value which is calculated by subtracting
the average of pixels under white rectangle from the average of pixels under black
rectangle.
HAAR CASCADES

The objective from this step is to ﬁnd out if the image has an edge separating dark
pixels on the right and light pixels on the left or not. Hence we say that there is an edge
detected if the haar value is closer to 1. In the example above, there is no edge as the
haar value is far from 1 (-0.02).
This is just one representation of a particular haar feature separating a vertical edge.
Now there are other haar features as well, which will detect edges in other directions
and any other image structures. To detect an edge anywhere in the image, the haar
feature needs to traverse the whole image.
Now, the haar features looping on an image would involve a lot of mathematical
calculations. As we can see for a single rectangle on either side, it involves 18 pixel value
additions (for a rectangle enclosing 18 pixels). Imagine doing this for the whole image
with all sizes (to be explained later) of the haar features. This would be a hectic
operation even for a high performance machine. This would be the motivation of the
second step.
HAAR CASCADES

Step2: Create integral images

The integral image is an image created from the original image where each pixel is
actually the summation of all pixels above and to the left of the current pixel.
HAAR CASCADES

Can you guess what is the gain of using integral images?

With the Integral Image, only maximum of 4 constant value operations are needed each
time for any feature size (with respect to the 18 additions earlier).
Operation = Bottom right value - value to left of bottom left value - value above top
right value + value to the top left of the top left value
HAAR CASCADES

Step3: Apply AdaBoost

We have said before that we use 180,000 haar features to find the object which is too
much to be used that is why Adaboost came to reduce the 180,000 to the top 6000 (as
proposed in the paper but can be any relevant number to your task) Haar features that
best distinguish the object from the non-object. The 180,000 haar features are called
weak learners as we have said before in the Adaboost (Revise tree-based algorithms in
ML) where from these weak learner we will try to figure out strong learners
How learning takes place in Adaboost?
● Given positive and negative image examples at the object (cropped faces at our
example)
● Loop by each of the 180,000 haar features we have on all our positive and negative
images and for each image we classify whether it is a face or not
● Calculate our error rate for each haar feature by computing the number of
misclassified faces over the total number of images.
HAAR CASCADES

● Choose the haar feature which has the lowest error rate and get it out of your haar
features (remaining haar features now is (180,000-1).
● Now update the weights of each misclassified image sample higher to stress on
correcting them in the next iteration.
● Loop again with all remaining haar features on all images once again and repeat
the error computation and choose the next haar feature to retain until reaching
the number of haar features needed or reach the accuracy you are seeking for.
Note:
● Don’t forget that in Adaboost, weights will be given to each predictor (haar
feature) based on its erroneous (performance) so better haar features will have
higher weights in the inference.
● Weights of data samples are different from final weights of predictors as weights
of data samples affect the next decision tree while weights of predictors affect the
final output in the inference stage.
HAAR CASCADES

Step4: Implement attention cascade classiﬁers

The subset of 6000 features will again run on the inference stage to detect if there’s a
facial feature present or not. The authors have taken a standard window size of 24x24
within which the feature detection will be running but this is a very computational and
time consuming task and that is why the attention cascade classiﬁer comes.
The idea behind this is, not all the features need to run on each and every window. If a
feature fails on a particular window, then we can say that the facial features are not
present there. Hence, we can move to the next windows where there can be facial
features present.
HAAR CASCADES

Haar features are applied on the images in stages in the following manner:
● The stages in the beginning contain simpler features, in comparison to the
features in a later stage which are complex, complex enough to ﬁnd the nitty gritty
details on the face. If the initial stage won’t detect anything on the window, then
discard the window itself from the remaining process, and move on to the next
window. This way a lot of processing time will be saved, as the irrelevant windows
will not be processed in the majority of the stages.
● The second stage processing would start, only when the features in the ﬁrst stage
are detected in the image. The process continues like this, i.e. if one stage passes,
the window is passed onto the next stage, if it fails then the window is discarded.

Check the visualization in the next slide.

HAAR CASCADES

This is a simple visualization, however in reality there is much stages than that and
much features in each stage.
HAAR CASCADES

In the paper, the author proposed a total of 38 stages for something around 6000
features. The number of features in the first five stages are 1, 10, 25, 25, and 50, and this
increased in the subsequent stages.
The initial stages with simpler and lesser number of features removed most of the
windows not having any facial features, thereby reducing the false negative ratio,
whereas the later stages with complex and more number of features can focus on
reducing the error detection rate, hence achieving low false positive ratio.
HAAR CASCADES
FEATURE DESCRIPTOR
Feature descriptor is a representation of an image that simplifies the image by
extracting only the most useful information and throw away all unneeded information.
Let us take an example and say:
Can you tell me what you see in the two images below?
FEATURE DESCRIPTOR
Let us make it slightly harder, Now Can you tell me what you see in the two images
below?
FEATURE DESCRIPTOR
What is the difference between first pair and second pair of images?
The first pair carries much more information like colors, shapes, background, etc.., while
the second pair only carries the edges, corners and shapes. Although the information in
second pair is less, we were able to say what is the object in the image because it
carries the most important information which is sufficient to recognize the object. This
what feature descriptors are made to do.
Famous feature descriptors:
● HoG (Histogram of Gradients)
● SIFT (Scale Invariant Feature Transform)
● SURF (Speeded-Up Robust Feature)
SURF is not included in the course but it is a successor of SIFT that uses integral images
in convolution with box filters to fast up the computation.
Let us now start with the Histogram of Gradients (HoG).
HISTOGRAM OF GRADIENTS (HOG)
HoG is is a feature descriptor that mainly relies on the idea of gradients (magnitudes
and directions) to detect an object in an image. It focuses on representing edges in a
better way as we will show now.
Steps:
● Preprocess the image.
● Calculate the gradients.
● Calculate the magnitude and the orientation of gradients.
● Calculate the histogram of the gradients in nxn cells.
● Normalize gradients in 2n x 2n cells.
● Generate the features for the whole image.

Let us now take it step by step and understand it.

HISTOGRAM OF GRADIENTS (HOG)
Step1: Preprocess the image.
The preprocessing needed here is resizing the image to 1:2 or 2:1 ratio between the width
and the height based on the object you are searching for if it is landscape or portrait.
The image size would be 64x128 (this the size used in their papers) which make it easier
in calculations in the following steps to grid our image to 8x8 or 16x16 grids.
HISTOGRAM OF GRADIENTS (HOG)
Step2: Calculate the gradients.
Calculate the gradient of each pixel in the image in both x-direction and y-direction

Gradient in x-direction at red pixel is (Gx) = 89-78 = 11

Gradient in y-direction at red pixel is (Gy) = 68-56 = 8
HISTOGRAM OF GRADIENTS (HOG)
This process will give us two new matrices – one storing gradients in the x-direction and
the other storing gradients in the y direction. The magnitude would be higher when
there is a sharp change in intensity, such as around the edges.
Can you tell us what this step looks like from what we have taken?
This is similar to using a Sobel or Prewitt Kernel of with size 1x3 or 3x1.

We have calculated the gradients in both x and y direction separately. The same process
is repeated for all the pixels in the image.
HISTOGRAM OF GRADIENTS (HOG)
Step3: Calculate the magnitude and the orientation
To calculate the magnitude and gradient at each pixel, we will use Pythagoras theorem:
Gradient Magnitude = √(Gx2 + Gy2) ⇒ √(112 + 82) = 13.6
Gradient Orientation = tan-1 (Gy / Gx) ⇒ tan-1 (8/11) = 36o
Step4: Calculate the histogram of the gradients in nxn cells (in the example below 5x5)
The simplest method for generating histogram is just counting the occurrence of each
value as shown below:
HISTOGRAM OF GRADIENTS (HOG)
The process is repeated for all the pixels orientation and magnitudes noting that here
the bin value of the histogram is 1. Hence we get about 180 different buckets, each
representing an orientation value. Another method is to create the histogram features
for higher bin values. By using a bin value of 20, we get 9 buckets only as shown below:

This gives us a 9x1 matrix instead of the 180x1 we got in the previous one.
HISTOGRAM OF GRADIENTS (HOG)
As we can notice, the only value that is taken into consideration is orientation, where is
the magnitude value contribution? We will make what we call weighted histogram.
Note: the higher contribution should be to the bin value which is closer to the
orientation.
HISTOGRAM OF GRADIENTS (HOG)
The histograms created in the HOG feature descriptor are not generated for the whole
image. Instead, the image is divided into 8×8 cells, and the histogram of oriented
gradients is computed for each cell. Why do you think this happens?
By doing so, we get the features (or histogram) for the smaller patches which in turn
represent the whole image. We can certainly change this value here from 8 x 8 to 16 x 16
or 32 x 32.
If we divide the image into 8×8 cells and generate the histograms,
we will get a 9 x 1 matrix for each grid (8x8 cell). The histograms are
weighted histograms as we have shown in the previous slide.
HISTOGRAM OF GRADIENTS (HOG)
Step5: Normalize gradients in 2n x 2n cells
If the grid was 8x8, hence we normalize the gradients on 16x16 cells.
Why we do such step? The gradients of the image are sensitive to the overall lighting.
This means that for a particular picture, some portion of the image would be very bright
as compared to the other portions. We cannot completely eliminate this from the image.
But we can reduce this lighting variation by normalizing the gradients by taking 16×16
blocks.
HISTOGRAM OF GRADIENTS (HOG)
How we normalize a vector of numbers?
Each 8×8 cell has a 9×1 matrix for a histogram. So, we would have four 9×1 matrices or a
single 36×1 matrix. To normalize this matrix, we will divide each of these values by the
square root of the sum of squares of the values.

Remember: The sum of squared of values is k = √(a12 + a22 + a32 + … + a362) noting that an2
is the nth value in the 36 x 1 matrix of the 16x16 grids we normalize on.
HISTOGRAM OF GRADIENTS (HOG)
Step6: Generate the features for the whole image
Can you guess what would be the total number of features that we will have for the
given image noting that the grids are 8x8 hence normalizing on 16x16 and the size of
image is 64x128?
we have created features for 16×16 blocks of the image. Now, we will combine all these
to get the features for the ﬁnal image. We would have 105 (7x15) blocks of 16×16 for a
single 64×128 image.
Each of these 105 blocks has a vector of 36×1 as features.
The total features for the image would be 105 x 36 x 1 = 3780 features.
HISTOGRAM OF GRADIENTS (HOG)
What happens if initial image size increases with same grid size?
The total number of features representing the images will increase as well which
represents more information in the image but takes more time as well.

Now we can say that HoG describes all the edges and their orientation in the image in
the form of important features called feature descriptors.
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

SIFT is our second feature descriptor which is used widely in image search, object
recognition and tracking. As HoG stressed on describing edges, SIFT stressed on
describing the interest points.
Can you tell me a common element in the following pictures?
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Probably you have said it is the Eiffel tower, The keen-eyed among you will also have
noticed that each image has a different background, is captured from different angles,
and also has different objects in the foreground (in some cases).It doesn’t matter if the
image is rotated at a weird angle or zoomed in to show only half of the Tower. We
naturally understand that the scale or angle of the image may change but the object
remains the same.

SIFT helps us locate these local features in different images which is the interest points
(Key Points) and we can use this feature descriptor as features for our image to detect
object. The major advantage of SIFT features, over edge features or hog features, is that
they are not affected by the size or orientation of the image (invariant to scale, rotation
and illumination changes while being robust to noise as well)
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Steps:
● Constructing scale space.
● Laplacian of Gaussian approximation (DoG).
● Finding key (interest) points.
● Eliminate edges and low contrast regions.
● Assign an orientation to the key points.
● Generate the SIFT features.

Let us now start understanding each step

SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step1: Constructing scale space

To create a scale space, we use Gaussian blur ﬁlters on the original images to reduce
noise in the image.the texture and minor details are removed from the image and only
the relevant information like the shape and edges remain
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Actually we aren’t doing the Gaussian blur once but progressively with higher sigmas
and on different octaves.
Note: Octaves means different image scaling. First octave is the original image, second
octave is half the size of ﬁrst octave, third octave is half the size of second octave and
so on.
Why we want to resize image? To make our descriptor scale-invariant. This means we
will be searching for these features on multiple scales, by creating a ‘scale space’.
Scale space is a collection of images having different scales(different sigmas), generated
from a single image.
how many times do we need to resize the image and how many subsequent blur images
need to be created for each resized image? The ideal number of octaves should be four,
and for each octave, the number of blur images should be ﬁve.
Check the following slide
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step2: Laplacian of Gaussian approximation (DoG).

Difference of Gaussians (DoG) is an approximation of making laplacian operation
(Second order derivative) which is computationally expensive. DoG on the other hand is
using the scale space we have created in the ﬁrst step and just subtracting each two
consecutive Gaussians from each other (in the same scale) which is a feature
enhancement algorithm.
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

What the output (DoG) of two Gaussians really means?

Since we subtract higher sigma (more blurred image) from the lower sigma (less blurred
image) then the DoG is actually what was appearing in the less blurred image and
disappeared in the more blurred image which is equivalent to Laplacian operation.
Remember: Gaussian washes out the edges more so subtracting the two images gives us
the detailed image. (Revise on sharpening part if you don’t remember)
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step3: Finding key (interest) points.

Key point is the pixel we are working on if it has the highest or lowest value among its
26 neighbours. What are the 26 neighbours !!! The 8 neighbours of the pixel around it in
addition to the 9 pixels in the previous and following images in the same octave.
Note: Scale here means different gaussians not different sizes (octaves). We work on 2
DoG only that have higher and lower images.
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step4: Eliminate edges and low contrast regions.

Key points that are generated in the previous step are too much while knowing that
some of them are edges (not unique but found a lot in any image)or don’t have enough
contrast. We will eliminate those key points to keep only the more intuitive key points
like corners (Revise Harris detector) which actually can be distinguishable and unique
for the image.
- Reject points with bad contrast (DoG smaller than 0.03 in magnitude)
- Reject edges as we have done in the Harris detector by noticing the lambda values
if one of them is much higher than the other.

TAKE CARE that we do this for all octaves and then get the union of all ﬁnal key points of
all included levels (2 middle levels in the 4 DoG levels) in all octaves and place the key
point location on the original image.
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step6: Assign an orientation to the key points

Calculate the magnitude and orientation at each key point pixel with among nxn grid
similarly to what we have done in the HoG descriptor by getting gradient magnitude (√
(Gx2 + Gy2)) and gradient orientation (tan-1 (Gy / Gx)). After that we make weighted
histogram (histogram with bins>1 and weights) as we have done before so we would
have something like that:
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

This histogram would peak at some point. The bin at which we see the peak will be the
orientation for the keypoint. Also, any peaks above 80% of the highest peak are
converted into a new keypoint. This new keypoint has the same location and scale as
the original but with different orientation. So, orientation can split up one keypoint into
multiple keypoints.
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Step6: Generate the SIFT features.

So far, we have stable key points that are scale-invariant and rotation invariant. In this
step, we will use the neighboring pixels, their orientations, and magnitude, to generate a
unique ﬁngerprint for this keypoint called a descriptor.
This is done by taking a 16×16 window around the keypoint. This 16×16 window is broken
into sixteen 4×4 windows
SCALE INVARIANT FEATURE TRANSFORMATION (SIFT)

Within each 4×4 window, gradient magnitudes and orientations are calculated. These
orientations are put into an 8 bin histogram. Do this for all sixteen 4×4 regions. So you
end up with 16x8 = 128 numbers. Once you have all 128 numbers, you normalize them.
These 128 numbers form the “feature vector”. This keypoint is uniquely identiﬁed by this
feature vector. We will have a feature vector (descriptor) for each key point in the image.

Note:
The higher gradient magnitudes are by default around the key point (This is why it was
identiﬁed as key point from the beginning) which means that pixels in each 4x4 window
that are closer to the key point will contribute more in the histogram (feature vector)
which is what we want.

3-1binary Images Morphology Thresholding
No ratings yet
3-1binary Images Morphology Thresholding
65 pages
06 Features
No ratings yet
06 Features
94 pages
Interest Points: Computer Vision Jia-Bin Huang, Virginia Tech
No ratings yet
Interest Points: Computer Vision Jia-Bin Huang, Virginia Tech
104 pages
9 Vision Lec 6
No ratings yet
9 Vision Lec 6
58 pages
L06-Corner Detectors
No ratings yet
L06-Corner Detectors
42 pages
Chapter - 5 Features, edge linking
No ratings yet
Chapter - 5 Features, edge linking
66 pages
06 Local Features
No ratings yet
06 Local Features
123 pages
3 Corners Blobs Descriptors
No ratings yet
3 Corners Blobs Descriptors
104 pages
AIS412 - Lecture 2
No ratings yet
AIS412 - Lecture 2
81 pages
CS4670: Computer Vision: Lecture 5: Feature Detection and Matching
No ratings yet
CS4670: Computer Vision: Lecture 5: Feature Detection and Matching
46 pages
CSE 185 Introduction To Computer Vision: Local Invariant Features
No ratings yet
CSE 185 Introduction To Computer Vision: Local Invariant Features
57 pages
computer_vision_2_feature_extraction_2_students
No ratings yet
computer_vision_2_feature_extraction_2_students
70 pages
Features Extraction Dr.tamizhselvan
No ratings yet
Features Extraction Dr.tamizhselvan
56 pages
CV - Lecture 3 - Image Features
No ratings yet
CV - Lecture 3 - Image Features
23 pages
4.01 08 2022 - FeatureDescriptors
No ratings yet
4.01 08 2022 - FeatureDescriptors
46 pages
lecture5-1
No ratings yet
lecture5-1
40 pages
SRM Ramapuram Digital image processing Unit 5 DIP (1)
No ratings yet
SRM Ramapuram Digital image processing Unit 5 DIP (1)
41 pages
cv2021 Lec2 Features I - 1600 - PDF - Gdrive.vip
No ratings yet
cv2021 Lec2 Features I - 1600 - PDF - Gdrive.vip
68 pages
4lec04 Harris for Web
No ratings yet
4lec04 Harris for Web
50 pages
Lecture 03
No ratings yet
Lecture 03
82 pages
3local_features
No ratings yet
3local_features
76 pages
Feature Detection and Matching
No ratings yet
Feature Detection and Matching
80 pages
2017 05 12 Image Segmentation
No ratings yet
2017 05 12 Image Segmentation
2 pages
3. Feature Extraction
No ratings yet
3. Feature Extraction
32 pages
CVML Mulakat Notlari
No ratings yet
CVML Mulakat Notlari
8 pages
Local_Features___Harris_Corner_Detection
No ratings yet
Local_Features___Harris_Corner_Detection
8 pages
Seconf Pub Lish
No ratings yet
Seconf Pub Lish
7 pages
Corner Detector in Computer Vision
No ratings yet
Corner Detector in Computer Vision
57 pages
unit 4
No ratings yet
unit 4
238 pages
Features
No ratings yet
Features
60 pages
Unit II - Chapter 4 - Feature Detection
No ratings yet
Unit II - Chapter 4 - Feature Detection
56 pages
Lec5. Keypoint Detection
No ratings yet
Lec5. Keypoint Detection
80 pages
3
No ratings yet
3
130 pages
Unit 4 Int345
No ratings yet
Unit 4 Int345
45 pages
Unit 4
No ratings yet
Unit 4
39 pages
Image Features and Descriptors
No ratings yet
Image Features and Descriptors
55 pages
Lecture 7 AI Summary
No ratings yet
Lecture 7 AI Summary
45 pages
CV Unit 3
No ratings yet
CV Unit 3
41 pages
lecture5-2 (1)
No ratings yet
lecture5-2 (1)
26 pages
A409163882_29458_9_2025_Unit-4
No ratings yet
A409163882_29458_9_2025_Unit-4
89 pages
GNR602-Lec14-15 Harris-HoG-SIFT
No ratings yet
GNR602-Lec14-15 Harris-HoG-SIFT
86 pages
Free Writing 2_1746419560
No ratings yet
Free Writing 2_1746419560
8 pages
Feature Detection: Jayanta Mukhopadhyay Dept. of Computer Science and Engg
No ratings yet
Feature Detection: Jayanta Mukhopadhyay Dept. of Computer Science and Engg
54 pages
IP_FeatureExtractionEndAnalysis_L7
No ratings yet
IP_FeatureExtractionEndAnalysis_L7
63 pages
L23 FeatureDetection
No ratings yet
L23 FeatureDetection
29 pages
Lecture 5 Stitching Blending
No ratings yet
Lecture 5 Stitching Blending
75 pages
Feature Extraction: Corners and Blobs
No ratings yet
Feature Extraction: Corners and Blobs
62 pages
6 2c. Corner Detection 14-08-2024
100% (1)
6 2c. Corner Detection 14-08-2024
36 pages
lecture13
No ratings yet
lecture13
12 pages
Invicta-2020 Day12
No ratings yet
Invicta-2020 Day12
61 pages
Unit-IV CC
No ratings yet
Unit-IV CC
167 pages
Overview and Implementation of Fast Corner Detection Method
No ratings yet
Overview and Implementation of Fast Corner Detection Method
11 pages
Interest Points Detection
No ratings yet
Interest Points Detection
76 pages
IT5409 Ch4 Part2 Feature ExtractionMatching
No ratings yet
IT5409 Ch4 Part2 Feature ExtractionMatching
85 pages
Feature Detection and Matching
No ratings yet
Feature Detection and Matching
78 pages
Image Segmentation
No ratings yet
Image Segmentation
20 pages
CV Lecture 06 FeatureDetection
No ratings yet
CV Lecture 06 FeatureDetection
75 pages
CHAP 7 Features Recognition and Classification
No ratings yet
CHAP 7 Features Recognition and Classification
94 pages
EECE 5639 Computer Vision I: Edge Detection, Corners Hw2 Has Been Posted
No ratings yet
EECE 5639 Computer Vision I: Edge Detection, Corners Hw2 Has Been Posted
59 pages
Bilinear Interpolation: Enhancing Image Resolution and Clarity through Bilinear Interpolation
From Everand
Bilinear Interpolation: Enhancing Image Resolution and Clarity through Bilinear Interpolation
Fouad Sabry
No ratings yet
Harris Corner Detector: Unveiling the Magic of Image Feature Detection
From Everand
Harris Corner Detector: Unveiling the Magic of Image Feature Detection
Fouad Sabry
No ratings yet
Sample PDF File For Magazines: WWW - Kroonpress.ee/icc
No ratings yet
Sample PDF File For Magazines: WWW - Kroonpress.ee/icc
1 page
RS125 96-98 Wiring PDCI-12 Ver2-1
No ratings yet
RS125 96-98 Wiring PDCI-12 Ver2-1
1 page
1-2020 New Design Pajamas Catalog
No ratings yet
1-2020 New Design Pajamas Catalog
8 pages
Baby Ad PDF
No ratings yet
Baby Ad PDF
2 pages
Kodak Dryview 6800
No ratings yet
Kodak Dryview 6800
4 pages
TFT 2.8 - QuickGuide - Guia Rapida - Guide Rapide
No ratings yet
TFT 2.8 - QuickGuide - Guia Rapida - Guide Rapide
4 pages
Morphological Image Processing
No ratings yet
Morphological Image Processing
45 pages
XYZ Lab Color Space
No ratings yet
XYZ Lab Color Space
15 pages
Explosives Rules, 2010 PDF
No ratings yet
Explosives Rules, 2010 PDF
203 pages
CPDprogram Radtech92217
No ratings yet
CPDprogram Radtech92217
15 pages
Paleta de Colores
No ratings yet
Paleta de Colores
6 pages
Chương 7 - Trắc nghiệm kiến thức - Attempt review
No ratings yet
Chương 7 - Trắc nghiệm kiến thức - Attempt review
12 pages
Chapter III - Image Enhancement
100% (1)
Chapter III - Image Enhancement
64 pages
Unit 2 MM
No ratings yet
Unit 2 MM
11 pages
What Is Texture in Graphic Design
No ratings yet
What Is Texture in Graphic Design
10 pages
Image Processing & Computer Vision
No ratings yet
Image Processing & Computer Vision
21 pages
Ryobi 3200 Ccdspecs
No ratings yet
Ryobi 3200 Ccdspecs
1 page
Air Dry Paint 99489-07A
No ratings yet
Air Dry Paint 99489-07A
22 pages
Color Schemes and The Color Wheel
No ratings yet
Color Schemes and The Color Wheel
7 pages
Athleisure Polyester Spandex Webversion
No ratings yet
Athleisure Polyester Spandex Webversion
9 pages
GELITE1
No ratings yet
GELITE1
15 pages
Computer Science Bacs2173 Graphics Programming
No ratings yet
Computer Science Bacs2173 Graphics Programming
6 pages
Happy Easter Mouse Printable Pack - A Little Pinch of Perfect
No ratings yet
Happy Easter Mouse Printable Pack - A Little Pinch of Perfect
20 pages
Foursquare Basic Tees: Happybiri
No ratings yet
Foursquare Basic Tees: Happybiri
12 pages
Digital Image Fusion
No ratings yet
Digital Image Fusion
20 pages
Pengolahan Citra Digital Untuk Menghitung Luas Daerah Bekas Penambangan Timah
No ratings yet
Pengolahan Citra Digital Untuk Menghitung Luas Daerah Bekas Penambangan Timah
9 pages
Ieee Format
No ratings yet
Ieee Format
6 pages
Image-Based Surface Crack Inspection and Pothole Depth Estimation
No ratings yet
Image-Based Surface Crack Inspection and Pothole Depth Estimation
11 pages
Color Interpolation Algorithmss
No ratings yet
Color Interpolation Algorithmss
15 pages

Classical Computer Vision - Session 2

Uploaded by

Classical Computer Vision - Session 2

Uploaded by

TEXTURE FEATURES

(u, v) is the shifting in x & y directions.

In your opinion, what output of E(u, v) makes it corner or not?

The output of Harris detector is rotation invariance so R is invariant to image rotation as

HAAR Cascades is an algorithm for object detection regardless of objects’ location or

Step1: Calculate Haar features

Step2: Create integral images

Can you guess what is the gain of using integral images?

Step3: Apply AdaBoost

Step4: Implement attention cascade classiﬁers

Check the visualization in the next slide.

Let us now take it step by step and understand it.

Gradient in x-direction at red pixel is (Gx) = 89-78 = 11

Let us now start understanding each step

Step1: Constructing scale space

Step2: Laplacian of Gaussian approximation (DoG).

What the output (DoG) of two Gaussians really means?

Step3: Finding key (interest) points.

Step4: Eliminate edges and low contrast regions.

Step6: Assign an orientation to the key points

Step6: Generate the SIFT features.

You might also like