Computer Vision Notes: Confirmed Midterm Exam Guide (Kisi-Kisi UTS)
Point-based Processing
Rotation (Rotasi)
The following formula is used to rotate an image, where 𝛩 (theta) is the angle of rotation:
x' = x·cos𝛩 − y·sin𝛩
y' = x·sin𝛩 + y·cos𝛩
Easy way to remember rotation
Say you want to rotate this vector by 90 degrees, twice, counterclockwise:
After the first rotation, the coordinates of the vector become:
After the second rotation, the coordinates of the vector become:
Now, mathematically, we can perform this 90-degree rotation by multiplying the vector by some unknown 2x2 matrix, twice:
First multiplication:
The result would be:
Second multiplication:
The result would be:
The full result matrix, with entries a, b, c, d, is:
Since cos 90° = 0, sin 90° = 1, and −(sin 90°) = −1, we can match the entries and rewrite the full result matrix as:
[ cos𝛩  −sin𝛩 ]
[ sin𝛩   cos𝛩 ]
Hurray \(^ω^\)
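To make the derivation concrete, here is a minimal NumPy sketch (my own, not from the notes) that builds the 2x2 rotation matrix and applies the 90° rotation twice to the sample vector (1, 0):

```python
import numpy as np

def rotation_matrix(theta_deg):
    """2x2 counterclockwise rotation matrix for an angle in degrees."""
    t = np.radians(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

R = rotation_matrix(90)          # numerically ~ [[0, -1], [1, 0]]
v = np.array([1.0, 0.0])         # sample vector pointing along +x

print(np.round(R @ v))           # first rotation  -> [0, 1]
print(np.round(R @ R @ v))       # second rotation -> [-1, 0]
```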
Shearing (Shear)
Shearing (a.k.a. skewing) is an operation that displaces a line vertically or horizontally depending on the
shear matrix used.
There are two types of shearing:
● Vertical
This type of shearing displaces lines vertically, depending on the values of 𝛼 and 𝑥.
● Horizontal
This type of shearing displaces lines horizontally, depending on the values of 𝛼 and 𝑦.
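As a rough illustration (my own; the shear factor 0.5 and the sample point are arbitrary), the two shear matrices look like this in NumPy:

```python
import numpy as np

a = 0.5                               # shear factor (alpha)
vertical_shear   = np.array([[1, 0],  # y' = y + a*x, x unchanged
                             [a, 1]])
horizontal_shear = np.array([[1, a],  # x' = x + a*y, y unchanged
                             [0, 1]])

p = np.array([2.0, 1.0])              # sample point (x, y)
print(vertical_shear   @ p)           # [2.0, 2.0]  -> displaced vertically
print(horizontal_shear @ p)           # [2.5, 1.0]  -> displaced horizontally
```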
Scaling
A transformation that enlarges or shrinks the image by a certain scale (constant).
There are two kinds of scaling transformations:
● Uniform/Isotropic scaling (scaling with the same constant)
This type of scaling uses the same scale factor for the 𝑥 and 𝑦 components of the vector.
● Non-uniform/Anisotropic scaling (scaling with different constants)
This type of scaling uses different scale factors for the 𝑥 and 𝑦 components of the vector.
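A small sketch (mine) contrasting the two kinds of scaling on an arbitrary sample point:

```python
import numpy as np

p = np.array([2.0, 3.0])                 # sample point (x, y)

uniform     = np.diag([2.0, 2.0])        # same factor for x and y (isotropic)
non_uniform = np.diag([2.0, 0.5])        # different factors (anisotropic)

print(uniform @ p)        # [4.0, 6.0]  -> shape preserved
print(non_uniform @ p)    # [4.0, 1.5]  -> shape distorted
```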
Translation (Translasi)
A transformation that moves every point of the image by a given distance; it cannot be written as the product of a 2x2 matrix and a 2x1 vector.
● Homogeneous Coordinates (Koordinat Homogen)
To allow translation, the image must use homogeneous coordinates, where the 2D vector is represented as a 3D vector and 𝑧 acts as a scale factor for the 𝑥 and 𝑦 components.
● Translation with Homogeneous Coordinates (Translasi dalam Koordinat Homogen)
Translation can be written as the product of a 3x3 matrix with a homogeneous vector (with 𝑧 = 1):
[ 1  0  𝛼 ]   [ x ]   [ x + 𝛼 ]
[ 0  1  𝛽 ] · [ y ] = [ y + 𝛽 ]
[ 0  0  1 ]   [ 1 ]   [   1   ]
where 𝑥 is moved by 𝛼 units and 𝑦 by 𝛽 units.
Converting a 2x2 matrix to 3x3 for homogeneous coordinates (Konversi matriks 2x2
menjadi 3x3 untuk koordinat homogen)
The transformation matrix can be converted to a 3x3 matrix for use with homogeneous coordinates:
[ a  b ]        [ a  b  0 ]
[ c  d ]   ->   [ c  d  0 ]
                [ 0  0  1 ]
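A minimal sketch (my own) showing the 3x3 translation matrix, the embedding of a 2x2 matrix into homogeneous form, and how the two can be chained:

```python
import numpy as np

def to_homogeneous_3x3(m2x2):
    """Embed a 2x2 transformation matrix into a 3x3 homogeneous matrix."""
    h = np.eye(3)
    h[:2, :2] = m2x2
    return h

def translation(alpha, beta):
    """3x3 homogeneous translation: x moves by alpha, y by beta."""
    return np.array([[1.0, 0.0, alpha],
                     [0.0, 1.0, beta ],
                     [0.0, 0.0, 1.0  ]])

p = np.array([2.0, 3.0, 1.0])            # point (x, y) with z = 1
print(translation(5, -1) @ p)            # [7.0, 2.0, 1.0]

# chaining: rotate by 90 degrees, then translate, in a single pass
R = to_homogeneous_3x3(np.array([[0, -1], [1, 0]]))
print(translation(5, -1) @ R @ p)        # rotation and translation combined
```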
Histogram Equalization
To calculate the equalized histogram, use the CDF (Cumulative Distribution Function).
By calculating the CDF we can obtain fN = (L − 1) · CDF for every intensity; rounding each fN result gives the new intensity level.
L = number of intensity levels
fk = frequency of intensity k (the CDF column below is the cumulative sum of fk divided by the total number of pixels)
Example:
Intensity | fk | CDF   | fN = 7·CDF | New intensity | New fk
0         | 2  | 2/25  | 0.56       | 1             | 2
1         | 4  | 6/25  | 1.68       | 2             | 4
2         | 5  | 11/25 | 3.08       | 3             | 5
3         | 2  | 13/25 | 3.64       | 4             | ↓
4         | 3  | 16/25 | 4.48       | 4             | 2 + 3 = 5
5         | 3  | 19/25 | 5.32       | 5             | 3
6         | 3  | 22/25 | 6.16       | 6             | 3
7         | 3  | 25/25 | 7          | 7             | 3
Log Transformation
s = c · log(1 + r), where c is a constant (usually 1) and r ≥ 0 is the input intensity.
This type of transformation is suited for expanding the dark values in an image while compressing the high
intensity values.
We can see from Figure 3.3 that:
● The log function maps a narrow range of low input intensities to a wide range of output levels, and a wide range of high input intensities to a narrow range of output levels.
● The inverse log function does the opposite (low intensities -> narrow output range, high intensities -> wide output range).
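A short sketch (mine; it assumes an 8-bit grayscale image stored as a NumPy array) of the log transform s = c·log(1 + r), with c chosen so the output stays in the 0-255 range:

```python
import numpy as np

def log_transform(img_u8):
    """Apply s = c*log(1 + r) to an 8-bit image, scaled back to 0..255."""
    r = img_u8.astype(np.float64)
    c = 255.0 / np.log(1.0 + 255.0)      # choose c so that 255 maps to 255
    s = c * np.log(1.0 + r)
    return s.astype(np.uint8)

# dark values get stretched, bright values get compressed
print(log_transform(np.array([0, 10, 50, 200, 255], dtype=np.uint8)))
```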
Power Law (Gamma) Transformation
The power-law transformation is s = c·r^γ, where c and γ are positive constants. For γ < 1 it expands dark values (similar to the log transform); for γ > 1 it expands bright values.
The worked example below returns to histogram equalization. The discrete form of its transformation function (the CDF) is:
s_k = T(r_k) = ((L − 1) / (MN)) · Σ_{j=0..k} n_j
where
● r_k is an input pixel with intensity level k (0-255, or 0 to L−1 in general, where L is the number of intensity levels / bins given by the colour bit depth).
● n_j is the number of pixels that have intensity level j.
● M is the number of pixel rows and N is the number of pixel columns (for example, if the image resolution is 640 x 480 then MN = 307200).
● s_k (in the output image) is the intensity that every pixel with value r_k (in the input image) is mapped to.
Example:
Say there is a 3-bit image represented as a 5x5 matrix:
5 6 3 1 5
1 2 5 3 3
6 4 1 7 7
3 4 0 6 2
2 7 5 0 5
We can calculate the frequency of each intensity value:
Intensity:  0  1  2  3  4  5  6  7
Frequency:  2  3  3  4  2  5  3  3
Since a 3-bit image has 8 intensity levels, (L − 1) = (8 − 1) = 7.
MN = 5 × 5 = 25
The equation becomes s_k = (7/25) · Σ_{j=0..k} n_j.
Calculate each s_k from k = 0 to 7:
s0 = 7/25 × 2 = 0.56
s1 = 7/25 × (2 + 3) = 1.4
s2 = 7/25 × (2 + 3 + 3) = 2.24
s3 = 7/25 × (2 + 3 + 3 + 4) = 3.36
s4 = 7/25 × (2 + 3 + 3 + 4 + 2) = 3.92
s5 = 7/25 × (2 + 3 + 3 + 4 + 2 + 5) = 5.32
s6 = 7/25 × (2 + 3 + 3 + 4 + 2 + 5 + 3) = 6.16
s7 = 7/25 × (2 + 3 + 3 + 4 + 2 + 5 + 3 + 3) = 7
Round every fractional result, since pixel values cannot be fractions (IIRC PaoPao said to round down / floor):
s0 = 0 (no change)
s1 = 1 (no change)
s2 = 2 (no change)
s3 = 3 (no change)
s4 = 3 (changed)
s5 = 5 (no change)
s6 = 6 (no change)
s7 = 7 (no change)
Since only intensity 4 is mapped to a different value (3) in the output image, we replace every 4 with 3, and the output image matrix becomes (the two changed entries are the 3s in the second column of rows 3 and 4):
5 6 3 1 5
1 2 5 3 3
6 3 1 7 7
3 3 0 6 2
2 7 5 0 5
(This is not a great example, since the histogram is already fairly balanced to begin with.)
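For reference, a short NumPy sketch (mine, not part of the notes) that reproduces the worked example above, flooring the results as in the example:

```python
import numpy as np

img = np.array([[5, 6, 3, 1, 5],
                [1, 2, 5, 3, 3],
                [6, 4, 1, 7, 7],
                [3, 4, 0, 6, 2],
                [2, 7, 5, 0, 5]])

L = 8                                          # 3-bit image -> 8 intensity levels
hist = np.bincount(img.ravel(), minlength=L)   # [2, 3, 3, 4, 2, 5, 3, 3]
cdf = np.cumsum(hist)                          # cumulative counts
s = np.floor((L - 1) * cdf / img.size).astype(int)   # s_k, floored as in the example

print(s)              # mapping for intensities 0..7 -> [0 1 2 3 3 5 6 7]
print(s[img])         # equalized image: every 4 becomes a 3
```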
Spatial Transformation (Neighbourhood Operations)
Definition of filters
There are two kinds of filter:
● A low-pass filter passes low frequencies; the effect it produces is blurring/smoothing of an image (such filters are also called averaging filters).
● A high-pass filter passes high frequencies; the effect it produces is sharpening (if the result of the filter is added to the original image).
We can achieve these effects by using spatial filters (also called spatial masks). A spatial filter consists of:
1. A neighbourhood, typically a small rectangle.
2. A predefined operation that is performed on the image pixels encompassed by the neighbourhood.
Spatial filtering creates a new pixel (in the output image) whose coordinates equal the coordinates of the center of the neighbourhood. If the operation performed on the image is linear, the filter is called a linear spatial filter; otherwise the filter is nonlinear.
Spatial Correlation and Convolution
There are two methods of spatial filtering:
1. Correlation, the process of moving the filter mask over the image and computing the sum of products at each location.
2. Convolution, which is the same as correlation except that the filter mask is rotated 180 degrees before being applied.
Note that if the filter mask is symmetric, correlation and convolution lead to the same result.
(leftmost column is symmetric)
Here is a step-by-step video on how to convolve a mask with an image:
https://youtu.be/XuD4C8vJzEQ?t=185
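A minimal scipy sketch (my own) of correlation versus convolution; the only difference is the 180° rotation of the mask, so a symmetric mask gives identical results:

```python
import numpy as np
from scipy.ndimage import correlate, convolve

img  = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
mask = np.array([[0., 1., 0.],
                 [1., 2., 1.],
                 [0., 1., 0.]])                   # symmetric mask

corr = correlate(img, mask, mode='constant')      # sum of products at each location
conv = convolve(img,  mask, mode='constant')      # same, but mask rotated 180 degrees

# because this mask is symmetric, both operations give the same answer
print(np.allclose(corr, conv))                    # True

# with a non-symmetric mask the results differ
asym = np.array([[1., 0., 0.],
                 [0., 0., 0.],
                 [0., 0., -1.]])
print(np.allclose(correlate(img, asym, mode='constant'),
                  convolve(img,  asym, mode='constant')))   # False
```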
Smoothing Spatial Filters (averaging)
● Smoothing is analogous to integration.
● Smoothing filters are used for blurring (removal of small details in an image) and noise reduction.
Blurring occurs because replacing each pixel by the average intensity of its neighbourhood reduces sharp transitions in intensity between adjacent pixels, and this also leads to noise reduction. However, edges (which are also characterized by sharp intensity transitions) get blurred as well.
● The mask in Figure 3.32(a) is called a box filter because all the coefficients in the mask are the same.
● The mask in Figure 3.32(b) is called a weighted average filter; this terminology indicates that pixels are multiplied by different coefficients, giving more importance/weight to some pixels (in this case, the closer a pixel is to the center, the bigger its coefficient).
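A small sketch (mine) of the two smoothing masks described above, applied to an image containing a single bright "noise" pixel:

```python
import numpy as np
from scipy.ndimage import correlate

box = np.ones((3, 3)) / 9.0                       # box filter: all coefficients equal

weighted = np.array([[1., 2., 1.],                # weighted average filter:
                     [2., 4., 2.],                # the closer to the center,
                     [1., 2., 1.]]) / 16.0        # the larger the coefficient

noisy = np.zeros((7, 7))
noisy[3, 3] = 9.0                                 # a single bright "noise" pixel

print(correlate(noisy, box, mode='constant')[2:5, 2:5])       # spread evenly over 3x3
print(correlate(noisy, weighted, mode='constant')[2:5, 2:5])  # more weight kept at the center
```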
Edge Detection
How does a computer define an edge? It's the sudden change in colour or intensity of colour.
Mathematical definition: an edge is the zero-crossing point of the second derivative, as illustrated below.
All you have to understand is that the derivative is the gradient/slope at any point in the graph.
The first-derivative graph is obtained by calculating the gradient at every point of the colour-intensity graph.
The second-derivative graph is obtained by calculating the gradient at every point of the first-derivative graph.
First and second order derivative
The first-order derivative of a digital image is defined as the partial derivatives with respect to x and y:
∂f/∂x = f(x+1, y) − f(x, y)
∂f/∂y = f(x, y+1) − f(x, y)
And for the second-order derivative:
∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2f(x, y)
∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2f(x, y)
Laplacian Edge Detection
The second-order derivative in image processing is implemented using the Laplacian operator.
The Laplacian operator is defined as the sum of the second-order derivatives with respect to x and y:
∇²f = ∂²f/∂x² + ∂²f/∂y² = f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) − 4f(x, y)
The equation above can be implemented as a 3x3 filter mask:
 0   1   0          1   1   1
 1  −4   1          1  −8   1
 0   1   0          1   1   1
● The left mask does not take the diagonal pixels into account when computing the derivative, and it is invariant to 90-degree rotation.
● The right mask extends the original equation by also taking the diagonal pixels into account, and it is invariant to 90- and 45-degree rotations.
● A rotation-invariant mask is called an isotropic filter.
● We can sharpen an image by adding the result of filtering the image with the Laplacian mask to the original image.
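A sketch (my own) of the two Laplacian masks and of sharpening by adding the filter response back; note that with a negative center coefficient the response has to be subtracted:

```python
import numpy as np
from scipy.ndimage import correlate

# Laplacian masks: without (left) and with (right) the diagonal neighbours
lap4 = np.array([[0.,  1., 0.],
                 [1., -4., 1.],
                 [0.,  1., 0.]])
lap8 = np.array([[1.,  1., 1.],
                 [1., -8., 1.],
                 [1.,  1., 1.]])

img = np.random.default_rng(0).random((32, 32))   # stand-in grayscale image

edges  = correlate(img, lap4, mode='reflect')      # second-derivative response
edges8 = correlate(img, lap8, mode='reflect')      # also responds to diagonal transitions

# Sharpening: add the Laplacian response back to the image. Because the center
# coefficient here is negative, the response is subtracted (equivalently, use a
# mask with a positive center and add it).
sharpened = img - edges
```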
The first-order derivative in image processing is implemented using the Sobel masks, among others.
Sobel Edge Detection
● The mask in the first row, left column computes the derivative in the horizontal direction.
● The mask in the first row, right column computes the derivative in the vertical direction.
● The masks in the second row compute the derivative in the diagonal directions.
● Sobel also smooths the image while differentiating.
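A sketch (mine) of the two main Sobel masks and the resulting gradient magnitude on a synthetic step edge:

```python
import numpy as np
from scipy.ndimage import correlate

sobel_x = np.array([[-1., 0., 1.],      # responds to horizontal intensity changes
                    [-2., 0., 2.],      # (i.e. vertical edges)
                    [-1., 0., 1.]])
sobel_y = sobel_x.T                     # responds to vertical intensity changes

img = np.zeros((16, 16))
img[:, 8:] = 1.0                        # vertical step edge in the middle

gx = correlate(img, sobel_x, mode='reflect')
gy = correlate(img, sobel_y, mode='reflect')
magnitude = np.hypot(gx, gy)            # edge strength at every pixel

print(magnitude[8, 6:10])               # strongest response around column 8
```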
Harris Corner Detection
The Harris detector looks for points where shifting a small window in any direction produces a large change in intensity. This window operation is mathematically defined as:
E(u, v) = Σ_{x,y} w(x, y) · [ I(x + u, y + v) − I(x, y) ]²
where:
● E is the difference between the original and the moved window.
● u is the window's displacement in the x direction
● v is the window's displacement in the y direction
● w(x, y) is the window at position (x, y). This acts like a mask, ensuring that only the desired window is used.
● I is the intensity of the image at a position (x, y)
● I(x+u, y+v) is the intensity of the moved window
● I(x, y) is the intensity of the original
Let's ignore w(x, y) for now and focus on the squared difference:
Σ_{x,y} [ I(x + u, y + v) − I(x, y) ]²
We can approximate I(x + u, y + v) above using a first-order multivariate Taylor expansion:
I(x + u, y + v) ≈ I(x, y) + Ix·u + Iy·v
and the equation becomes:
Σ_{x,y} [ I(x, y) + Ix·u + Iy·v − I(x, y) ]²
In the above equation, I(x, y) cancels out, so expanding the square gives:
Σ_{x,y} ( Ix²·u² + 2·Ix·Iy·u·v + Iy²·v² )
This can be turned into a matrix-vector multiplication (since the summation runs only over x and y, the vector (u, v) and its transpose can be moved outside the summation):
E(u, v) ≈ [u  v] · ( Σ_{x,y} [ Ix², Ix·Iy ; Ix·Iy, Iy² ] ) · [u, v]ᵀ
Now we can extract the matrix in the parentheses, called the structure tensor / second moment matrix, into M (adding w(x, y) back, since it also sits inside the summation):
M = Σ_{x,y} w(x, y) · [ Ix², Ix·Iy ; Ix·Iy, Iy² ]
so that E(u, v) ≈ [u  v] · M · [u, v]ᵀ.
Step 3:
Compute the eigenvalues λ1 and λ2 of every M matrix (one M per x, y coordinate).
Then plug them into the response function:
R = λ1·λ2 − k·(λ1 + λ2)² = det(M) − k·trace(M)²
where k is a small empirically chosen constant (typically around 0.04-0.06); a large R indicates a corner.
Step 4:
Perform NMS on the list of R corner responses to find the best corners and eliminate unnecessary candidates that are not true local maxima (see APPENDIX A).
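Putting the steps together, here is a compact sketch (my own; the constant k, the Gaussian window sigma, and the threshold are illustrative choices) of Harris corner detection with scipy:

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_filter, maximum_filter

def harris_response(img, k=0.05, sigma=1.0):
    """Harris corner response R = det(M) - k*trace(M)^2 at every pixel."""
    ix = sobel(img, axis=1)                      # horizontal gradient I_x
    iy = sobel(img, axis=0)                      # vertical gradient I_y
    # entries of the second moment matrix M, windowed by a Gaussian w(x, y)
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy * ixy                  # = lambda1 * lambda2
    trace = ixx + iyy                            # = lambda1 + lambda2
    return det - k * trace ** 2

def harris_corners(img, threshold=0.1):
    """Keep responses that are local maxima (a simple non-maximum suppression)."""
    r = harris_response(img)
    local_max = (r == maximum_filter(r, size=3))
    return np.argwhere(local_max & (r > threshold * r.max()))

img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0                            # a white square; its corners should respond
print(harris_corners(img))
```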
Blob Detection
In computer vision, blob detection methods are aimed at detecting regions in a digital image that differ in
properties, such as brightness or color, compared to surrounding regions.
Methods:
● Laplacian of Gaussian (LoG)
● Difference of Gaussians (DoG)
● Determinant of Hessian (DoH)
Laplacian of Gaussian
Given an input image I, create a Gaussian-blurred version of it, G. Applying the Laplacian operator to G (taking the second derivative, which is the very definition of an edge, if you remember) gives you the LoG. (source:
http://fourier.eng.hmc.edu/e161/lectures/gradient/node8.html)
Difference of Gaussians
Given an input image I, create multiple Gaussian-blurred versions of it with different kernel sizes (sigmas) and take the differences between successive pairs; this approximates the Laplacian of Gaussian. (SIFT uses this.)
Determinant of Hessian
Simply put, the Hessian operator is a better version of the Laplacian operator. (SURF uses this.)
(source: https://en.wikipedia.org/wiki/Blob_detection#The_determinant_of_the_Hessian)
This is because the Hessian contains more information: it holds all the possible second-order partial derivatives, whereas the Laplacian only stores their sum. The Hessian matrix looks like this:
H = [ ∂²f/∂x²     ∂²f/∂x∂y ]
    [ ∂²f/∂y∂x    ∂²f/∂y²  ]
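A small sketch (mine) of LoG blob detection using scipy's gaussian_laplace; multiplying by σ² is the usual scale normalisation that makes responses at different scales comparable:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

img = np.zeros((64, 64))
img[28:36, 28:36] = 1.0                       # a bright blob, roughly 8 pixels wide

# LoG: Gaussian smoothing followed by the Laplacian, in a single operator.
# A bright blob on a dark background gives a negative LoG response at its
# center, hence the minus sign. The peak response occurs when sigma roughly
# matches the blob size.
for sigma in (1.0, 2.0, 3.0, 4.0):
    response = -gaussian_laplace(img, sigma) * sigma ** 2   # scale-normalised
    print(sigma, round(float(response.max()), 3))
```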
Hough Transform
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital
image processing.[1]
The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform.
The classical Hough transform was concerned with the identification of lines in the image, but later the Hough transform has been extended to identifying positions of arbitrary shapes, most commonly circles or ellipses.
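To make the voting procedure concrete, here is a minimal pure-NumPy accumulator sketch for line detection (my own illustration; in practice one would typically call cv2.HoughLines on an edge map):

```python
import numpy as np

def hough_lines(edge_img, n_theta=180):
    """Vote in (rho, theta) parameter space for every edge pixel."""
    h, w = edge_img.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(n_theta))              # 0..179 degrees
    accumulator = np.zeros((2 * diag, n_theta), dtype=int)
    ys, xs = np.nonzero(edge_img)                        # edge pixel coordinates
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + diag, np.arange(n_theta)] += 1   # one vote per theta
    return accumulator, diag

# a diagonal line of edge pixels: its cell should be a clear local maximum
edges = np.eye(50, dtype=bool)
acc, diag = hough_lines(edges)
rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
print(rho_idx - diag, theta_idx)     # rho ~ 0, theta ~ 135 degrees for the line y = x
```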
Image Descriptors
● Most features can be thought of as templates, histograms (counts), or combinations
● The ideal descriptor should be
○ Robust and Distinctive
○ Compact and Efficient
● Most available descriptors focus on edge/gradient information
○ Capture texture information
○ Color rarely used
Main Components
1. Detection: Identify the interest points
2. Description: Extract a feature descriptor vector around each interest point.
3. Matching: Determine correspondence between descriptors in two views
Scale Invariant Feature Transform (SIFT)
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
The scale-invariant feature transform (SIFT) is a feature detection algorithm in computer vision to detect and describe local features in images, published by David Lowe in 1999. SIFT can robustly identify objects even among clutter and under partial occlusion, because the SIFT feature descriptor is invariant to uniform scaling, orientation, and illumination changes, and partially invariant to affine distortion.
Affine distortion example: an image of a fern-like fractal that exhibits affine self-similarity.
SIFT keypoints of objects are first extracted from a set of reference images and stored
in a database. An object is recognized in a new image by individually comparing each feature from the new
image to this database and finding candidate matching features based on Euclidean distance of their
feature vectors.
Key locations are defined as maxima and minima of the result of the difference of Gaussians (DoG) function applied in scale space to a series of smoothed and resampled images.
SIFT uses a modified version of the k-d tree (a binary search tree that stores k-dimensional coordinates) called the best-bin-first search method[14] that can identify the nearest neighbors with high probability using only a
limited amount of computation. The BBF algorithm uses a modified search ordering for the k-d tree
algorithm so that bins in feature space are searched in the order of their closest distance from the query
location. This search order requires the use of a heap-based priority queue for efficient determination of the
search order. The best candidate match for each keypoint is found by identifying its nearest neighbor in the
database of keypoints from training images. The nearest neighbors are defined as the key points with
minimum Euclidean distance from the given descriptor vector. The probability that a match is correct can
be determined by taking the ratio of distance from the closest neighbor to the distance of the second
closest.
Lowe[3] rejected all matches in which the distance ratio is greater than 0.8, which eliminates 90% of the
false matches while discarding less than 5% of the correct matches. To further improve efficiency, the best-bin-first search was cut off after checking the first 200 nearest-neighbor candidates. For
a database of 100,000 keypoints, this provides a speedup over exact nearest neighbor search by about 2
orders of magnitude, yet results in less than a 5% loss in the number of correct matches.
SIFT uses Hough Transform to identify clusters of features with a consistent interpretation by using each
feature to vote for all object poses that are consistent with the feature. When clusters of features are found
to vote for the same pose of an object, the probability of the interpretation being correct is much higher
than for any single feature.
Finally, outliers can now be removed by checking for agreement between each image feature and the model, given the parameter solution. Given the linear least squares solution (linear regression), each match is required to agree within half the error range that was used for the parameters in the Hough transform bins. As outliers are discarded, the linear least squares solution is re-solved with the remaining points, and the process is iterated. If fewer than 3 points remain after discarding outliers, then the match is rejected. In
addition, a top-down matching phase is used to add any further matches that agree with the projected
model position, which may have been missed from the Hough transform bin due to the similarity transform
approximation or other errors.
The final decision to accept or reject a model hypothesis is based on a detailed probabilistic model.[15] This
method first computes the expected number of false matches to the model pose, given the projected size
of the model, the number of features within the region, and the accuracy of the fit. A Bayesian probability
analysis then gives the probability that the object is present based on the actual number of matching
features found. A model is accepted if the final probability for a correct interpretation is greater than 0.98.
Lowe's SIFT based object recognition gives excellent results except under wide illumination variations and
under non-rigid transformations.
SIFT consists of the following steps:
1. Scale-space extrema detection
Use the Difference of Gaussians (DoG) to identify potential interest points that are invariant to scale and orientation
2. Keypoint localization
Reject low contrast points and eliminate edge responses
3. Orientation assignment
Each keypoint is assigned one or more orientations based on local image gradient direction to
achieve invariance to rotation
4. Keypoint descriptor
Compute a descriptor vector for each keypoint such that the descriptor is highly distinctive and partially invariant to the remaining variations
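A minimal OpenCV sketch (mine; it needs the opencv-python package, and the two file names are placeholders for your own images) of SIFT detection, description, and matching with Lowe's ratio test:

```python
import cv2

# placeholder input files; any two photos of the same object will do
img1 = cv2.imread('reference.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('query.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)    # keypoints + 128-d descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# match each descriptor to its two nearest neighbours (Euclidean distance)
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if the closest neighbour is clearly
# better than the second closest (ratio threshold 0.8, as in the text above)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.8 * pair[1].distance:
        good.append(pair[0])

print(len(good), 'matches kept out of', len(matches))
```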
APPENDIX A
Image gradient
Let's define the derivative with respect to x as gx and the derivative with respect to y as gy
(see the First and second order derivative section for an explanation of the derivation).
To find the edge strength and direction at location (x, y) in an image f, we need to compute the gradient:
∇f = [ gx  gy ]ᵀ = [ ∂f/∂x  ∂f/∂y ]ᵀ
The magnitude (length) of the gradient vector, denoted M(x, y), is its Euclidean norm:
M(x, y) = √(gx² + gy²)
The direction of the gradient vector at point (x, y) is given by its angle with respect to the x-axis:
α(x, y) = arctan(gy / gx)
We can use gradient operators to compute edge direction and strength (illustrated below):
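A small NumPy sketch (my own) that computes the gradient magnitude and direction using the forward-difference definitions from the First and second order derivative section:

```python
import numpy as np

def gradient(f):
    """Per-pixel gradient strength M(x, y) and direction alpha(x, y) in degrees."""
    gx = np.zeros_like(f, dtype=float)
    gy = np.zeros_like(f, dtype=float)
    gx[:-1, :] = f[1:, :] - f[:-1, :]          # df/dx = f(x+1, y) - f(x, y)
    gy[:, :-1] = f[:, 1:] - f[:, :-1]          # df/dy = f(x, y+1) - f(x, y)
    magnitude = np.hypot(gx, gy)               # Euclidean length of the gradient
    direction = np.degrees(np.arctan2(gy, gx)) # angle with respect to the x axis
    return magnitude, direction

f = np.zeros((5, 5))
f[:, 2:] = 1.0                                  # a simple step edge
m, a = gradient(f)
print(m[2], a[2])                               # strong response at the step
```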
Non-Maxima suppression
Example:
Let d1, d2, d3, and d4 denote the four basic edge directions for a 3x3 region: horizontal (0° and ±180°), −45°, vertical (+90° and −90°), and +45°, respectively. We can formulate the following non-maxima suppression scheme for a 3x3 region centered at point (x, y):
1. Compute the gradient magnitude M(x, y) and angle α(x, y).
2. Find the direction dk that is closest to α(x, y).
a. For example: if α(x, y) = 20°, then the closest direction to α (the edge normal) is horizontal, since (20 − 0) = 20 while (45 − 20) = 25.
b. Since the edge direction is perpendicular to the edge normal, the edge direction is 0 + 90 = +90° and 0 − 90 = −90° (the vertical direction).
3. If the value of M(x, y) is less than at least one of its two neighbors along that direction, let f(x, y) = 0 (suppression); otherwise, let f(x, y) = M(x, y).
a. Continuing the example in 2, the two neighbors along the vertical direction are (x, y+1) and (x, y−1).
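As a compact sketch (mine, written in the common Canny-style convention of comparing each pixel with its two neighbours along the quantized gradient direction):

```python
import numpy as np

def non_maxima_suppression(magnitude, angle_deg):
    """Canny-style NMS: compare each pixel with its two neighbours along the
    quantized gradient direction; suppress it if it is not a local maximum."""
    out = np.zeros_like(magnitude)
    # neighbour offsets for the four quantized gradient directions
    offsets = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}
    # quantize each angle to 0, 45, 90 or 135 degrees
    q = (np.round(np.mod(angle_deg, 180) / 45.0).astype(int) % 4) * 45
    h, w = magnitude.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            di, dj = offsets[int(q[i, j])]
            neighbours = (magnitude[i + di, j + dj], magnitude[i - di, j - dj])
            out[i, j] = magnitude[i, j] if magnitude[i, j] >= max(neighbours) else 0.0
    return out
```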
Characteristics of a Good Feature Detector (may come up in the theory part)
● Repeatability
○ The same feature can be found in several images despite geometric and photometric
transformations
● Saliency
○ Each feature is distinctive
● Compactness and efficiency
○ Many fewer features than image pixels
● Locality
○ A feature occupies a relatively small area of the image; robust to clutter and occlusion
Criteria for Optimal Edge Detection (this too)
● Good detection
○ The optimal detector must minimize the probability of false positives (detecting spurious edges
caused by noise), as well as that of false negatives (missing real edges)
● Good localization
○ The edges detected must be as close as possible to the true edges
● Single response constraint
○ The detector must return one point only for each true edge point, that is, minimize the number of
local maxima around the true edge (created by noise)