Assignment-2:DIP: Mr. Victor Mageto CP10101610245
Q1. Write an essay on three object detection algorithms. Answer:
R-CNN
To avoid having to classify a huge number of regions, Ross Girshick et al.
proposed a method that uses selective search to extract just 2000 regions from
the image, which they called region proposals. Instead of trying to classify a
huge number of regions, the detector works with only these 2000 region
proposals, which are generated by the selective search algorithm.
These 2000 candidate region proposals are warped into a square and fed into a
convolutional neural network that produces a 4096-dimensional feature vector as
output. The CNN acts as a feature extractor: the output dense layer holds the
features extracted from the image, and these features are fed into an SVM that
classifies the presence of the object within that candidate region proposal. In
addition to predicting the presence of an object within the region proposal, the
algorithm also predicts four offset values that increase the precision of the
bounding box. For example, given a region proposal, the algorithm might predict
the presence of a person, but the face of that person could be cut in half by
the proposal. The offset values help adjust the bounding box of the region
proposal to correct this.
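As a rough illustration of how four offset values can adjust a region proposal, the sketch below uses the centre/size parameterisation commonly associated with R-CNN bounding-box regression. The function name `apply_offsets` and the sample numbers are hypothetical and chosen only for illustration.

```python
import math

def apply_offsets(proposal, offsets):
    """Refine a region proposal with four predicted offset values.

    proposal: (cx, cy, w, h) centre and size of the proposal, in pixels.
    offsets:  (tx, ty, tw, th) values predicted by the regressor.
    """
    cx, cy, w, h = proposal
    tx, ty, tw, th = offsets
    new_cx = cx + w * tx        # shift the centre by a fraction of the width
    new_cy = cy + h * ty        # shift the centre by a fraction of the height
    new_w = w * math.exp(tw)    # rescale the width (exp keeps it positive)
    new_h = h * math.exp(th)    # rescale the height
    return (new_cx, new_cy, new_w, new_h)

# Hypothetical proposal that cuts off the top of a person: the offsets move
# the box up slightly and make it 1.5 times taller.
refined = apply_offsets((100.0, 80.0, 40.0, 60.0),
                        (0.1, -0.05, 0.0, math.log(1.5)))
print(refined)
```

Offsets of all zeros leave the proposal unchanged, which is a convenient sanity check for this parameterisation.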
YOLO
YOLO takes an image and splits it into an S x S grid; within each grid cell it
takes m bounding boxes. For each bounding box, the network outputs a class
probability and offset values for the bounding box. The bounding boxes whose
class probability is above a threshold value are selected and used to locate the
object within the image.
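The grid and threshold steps can be sketched in a few lines of plain Python. This is only an illustration of the selection logic, not of the network itself: the helper names `cell_of` and `select_boxes`, the 448 x 448 image size, and the prediction values are all assumptions made up for the example.

```python
def cell_of(x, y, image_w, image_h, s):
    """Return the (row, col) of the S x S grid cell that contains point (x, y)."""
    col = min(int(x * s / image_w), s - 1)   # clamp points on the right edge
    row = min(int(y * s / image_h), s - 1)   # clamp points on the bottom edge
    return row, col

def select_boxes(predictions, threshold=0.5):
    """Keep only the boxes whose class probability is above the threshold."""
    return [p for p in predictions if p["prob"] > threshold]

# Hypothetical network outputs: box = (x_centre, y_centre, width, height).
predictions = [
    {"box": (224, 100, 80, 120), "label": "person", "prob": 0.91},
    {"box": (60, 300, 40, 40), "label": "dog", "prob": 0.32},
    {"box": (400, 420, 50, 30), "label": "car", "prob": 0.67},
]

for p in select_boxes(predictions, threshold=0.5):
    x, y, _, _ = p["box"]
    print(p["label"], "in grid cell", cell_of(x, y, 448, 448, s=7))
```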
YOLO is orders of magnitude faster (45 frames per second) than other object
detection algorithms. The limitation of the YOLO algorithm is that it struggles
with small objects within the image; for example, it might have difficulty
detecting a flock of birds. This is due to the spatial constraints of the
algorithm, since each grid cell predicts only m bounding boxes.
Fast R-CNN
The same author as the previous paper (R-CNN) addressed some of the drawbacks of
R-CNN to build a faster object detection algorithm, called Fast R-CNN. The
approach is similar to the R-CNN algorithm, but instead of feeding the region
proposals to the CNN, we feed the input image to the CNN to generate a
convolutional feature map. On the convolutional feature map, we identify the
regions that correspond to the proposals, and a RoI pooling layer reshapes each
of them into a fixed size so that it can be fed into a fully connected layer.
From the RoI feature vector, we use a softmax layer to predict the class of the
proposed region, and we also predict the offset values for the bounding box.
Fast R-CNN is faster than R-CNN because the 2000 region proposals do not have to
be fed to the convolutional neural network separately for every image. Instead,
the convolution operation is done only once per image, and a feature map is
generated from it.
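The RoI pooling step described above can be sketched on a toy feature map. This is a minimal plain-Python illustration, assuming a single-channel feature map stored as a list of lists and a region given as end-exclusive (row0, col0, row1, col1) coordinates; the function name `roi_max_pool` and the bin-splitting rule are choices made for this sketch, not the exact layer from the paper.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)

def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (r0, c0, r1, c1) of a 2D feature map into a
    fixed out_h x out_w grid, whatever the size of the region."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        rs = r0 + (i * h) // out_h             # first row of bin i
        re = r0 + ceil_div((i + 1) * h, out_h) # one past the last row of bin i
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ceil_div((j + 1) * w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re) for c in range(cs, ce)))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two regions of different sizes both come out as 2 x 2 grids.
print(roi_max_pool(feature_map, (0, 1, 4, 6), 2, 2))
print(roi_max_pool(feature_map, (1, 0, 3, 2), 2, 2))
```

Because the feature map is computed once and only this cheap pooling is repeated per proposal, the per-region cost is much smaller than running the CNN on every warped proposal.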
Q2. With the help of an appropriate example, give the applications of edge
detection.
Answer:
Sobel Operator: It is a discrete differentiation operator that computes an
approximation of the gradient of the image intensity function for edge
detection. At each pixel of an image, the Sobel operator produces either the
corresponding gradient vector or the norm of this vector. It uses two 3 x 3
kernels or masks, which are convolved with the input image to calculate the
vertical and horizontal derivative approximations respectively.
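The two 3 x 3 Sobel kernels and the resulting gradient magnitude can be illustrated with a small plain-Python sketch. It assumes a greyscale image stored as a list of lists, skips the border pixels, and applies the kernels by correlation rather than flipped convolution, which only changes the sign of each derivative and not the magnitude. The helper names are made up for this example.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]    # horizontal derivative (responds to vertical edges)
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]     # vertical derivative (responds to horizontal edges)

def apply_kernel(image, kernel, r, c):
    """Weighted sum of the 3 x 3 neighbourhood centred on pixel (r, c)."""
    return sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel_magnitude(image):
    """Gradient magnitude at every interior pixel of the image."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out

# A 4 x 5 test image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10, 10] for _ in range(4)]
for row in sobel_magnitude(image):
    print(row)
```

The magnitude is large next to the step and zero in the flat area to the right of it.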
Prewitt Operator: This operator is very similar to the Sobel operator. It also
detects the vertical and horizontal edges of an image, and it is a common way to
estimate the orientation and magnitude of the edges in an image. Like the Sobel
operator, it uses a pair of 3 x 3 kernels or masks.
Roberts Operator: This gradient-based operator computes the sum of squares of
the differences between diagonally adjacent pixels in an image through discrete
differentiation, and the gradient approximation is then made from these
differences. It uses a pair of 2 x 2 kernels or masks.
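For reference, the standard Roberts cross kernels are [[1, 0], [0, -1]] and [[0, 1], [-1, 0]]. The sketch below applies them to a tiny image in plain Python and takes the square root of the sum of squares as the gradient magnitude; the function name `roberts_cross` and the test image are made up for illustration.

```python
import math

def roberts_cross(image):
    """Gradient magnitude from differences of diagonally adjacent pixels."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 1):
        row = []
        for c in range(cols - 1):
            d1 = image[r][c] - image[r + 1][c + 1]   # kernel [[1, 0], [0, -1]]
            d2 = image[r][c + 1] - image[r + 1][c]   # kernel [[0, 1], [-1, 0]]
            row.append(math.sqrt(d1 * d1 + d2 * d2))
        out.append(row)
    return out

# Dark region in the top-left corner, bright region in the bottom-right.
image = [
    [0, 0, 0],
    [0, 0, 9],
    [0, 9, 9],
]
for row in roberts_cross(image):
    print(row)
```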
Laplacian of Gaussian (LoG) Operator: It is a Gaussian-based operator that uses
the Laplacian to take the second derivative of an image. It works well when the
transition of the grey level is abrupt. It is based on the zero-crossing method:
where the second-order derivative crosses zero, the first derivative is at a
maximum, and that location is taken as an edge location. Here the Gaussian
operator reduces the noise and the Laplacian operator detects the sharp edges.
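The zero-crossing idea is easiest to see in one dimension. The sketch below assumes the signal has already been smoothed (so the Gaussian step is not shown) and uses a simple three-point second difference in place of the Laplacian. The function names and the sample signal are illustrative assumptions.

```python
def second_derivative(signal):
    """Discrete second derivative f[i-1] - 2*f[i] + f[i+1] at interior samples."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def zero_crossings(values):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]

# A smoothed step edge: the intensity rises fastest between samples 3 and 4.
signal = [0, 0, 1, 4, 8, 11, 12, 12]
d2 = second_derivative(signal)
print(d2)
print(zero_crossings(d2))
```

Since `d2[k]` belongs to `signal[k + 1]`, a crossing reported at index 2 lies between signal samples 3 and 4, which is where the first difference is largest.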
characteristics. The segmentation process includes restoration, enhancement, and
representation of the image data in the required form.
Image Segmentation Techniques
The next category of image segmentation methods is based on the properties of
the images to be segmented. It is divided into the discontinuity detection based
approach and the similarity detection based approach. In the discontinuity
detection based approach, the segmentation is based on discontinuities in the
image, as in edge based segmentation. The similarity detection based approach is
based on the similarity of regions, as in threshold based segmentation, region
growing, and region splitting and merging. A segmentation technique that relies
on information about the structure of the required portion of the image is known
as structural segmentation. Most segmentation methods are of the stochastic
type, where the segmentation depends entirely on the discrete pixel values of
the image.
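As an example of the similarity detection based approach, region growing can be sketched as a breadth-first search from a seed pixel. This is one simple variant, assuming 4-connectivity and a fixed intensity tolerance measured against the seed value; the function name `region_grow` and the sample image are hypothetical.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a region from seed = (row, col), adding 4-connected neighbours
    whose intensity differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

image = [
    [10, 11, 50, 52],
    [12, 10, 51, 50],
    [11, 13, 49, 90],
]
dark = region_grow(image, (0, 0), tolerance=5)
bright = region_grow(image, (0, 2), tolerance=5)
print(sorted(dark))
print(sorted(bright))
```

The pixel with value 90 joins neither region, because it is too far from both seed intensities.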
The output image is based on this threshold value. Variable Thresholding: In
this type of segmentation method, the value of the threshold can vary within a
single image. Multiple Thresholding: In this kind of thresholding, the output of
the segmentation is based on multiple threshold values. Threshold values can be
computed from image histograms. In [1], a threshold based level set approach
that combines threshold based segmentation with the fast marching method is
proposed for medical image segmentation. To improve the image acquisition
process in computer vision, a threshold based segmentation method based on
entropy criteria and a genetic algorithm is described in [3].
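The difference between a single threshold and multiple thresholds can be shown with a short plain-Python sketch. It assumes the thresholds are already known and sorted in ascending order (how they are chosen from the histogram is not shown), and it labels each pixel by the number of thresholds it reaches. The function name `multi_threshold` and the values are illustrative only.

```python
from bisect import bisect_right

def multi_threshold(image, thresholds):
    """Label each pixel with the number of thresholds that are <= its value.

    `thresholds` must be sorted in ascending order. With one threshold the
    result is a binary image; with k thresholds there are k + 1 classes.
    """
    return [[bisect_right(thresholds, pixel) for pixel in row]
            for row in image]

image = [
    [10, 90, 200],
    [85, 169, 170],
]
print(multi_threshold(image, [128]))       # single threshold: two classes
print(multi_threshold(image, [85, 170]))   # two thresholds: three classes
```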
The edge based segmentation method is based on sudden changes of intensity
values in an image. In image processing, object boundaries are represented using
edges. Edge based segmentation works by identifying the regions of abrupt
intensity change in an image [4]. There are two main types of edge based
segmentation methods. Grey Histogram Technique: In this method, the foreground
is separated from the background based on a threshold value. The difficulty lies
in choosing the correct threshold value. Gradient Based Method: The gradient can
be defined as the first derivative of the image near the edge. A large change in
the intensity values between two regions shows up as a high value of the
gradient magnitude. An edge based auto threshold generating method has been
introduced to perform multi scale image segmentation, and another method for
edge detection uses a variance filter.