
Assignment-2:DIP

Submitted to: Mr. Victor Mageto
Submitted by: D Vijayendravarma (CP10101610245)

Q1. Write an essay on three object detection algorithms.
Answer:
R-CNN
To bypass the problem of evaluating a huge number of candidate regions, Ross Girshick et al.
proposed a method that uses selective search to extract just 2000 regions
from the image, which they called region proposals. Instead of
trying to classify a huge number of regions, the network can therefore work with
only these 2000. The region proposals are generated by the selective search
algorithm, which hierarchically merges similar neighbouring regions.

These 2000 candidate region proposals are warped into a square and fed into a convolutional
neural network that produces a 4096-dimensional feature vector as output. The
CNN acts as a feature extractor: the output dense layer holds the features
extracted from the image, and these features are fed into an SVM
to classify the presence of the object within each candidate region proposal. In
addition to predicting the presence of an object within a region proposal, the
algorithm also predicts four offset values that refine the bounding box. For
example, given a region proposal, the algorithm might predict the presence of a
person even though the face of that person is cut in half by the proposal; the
offset values then help adjust the bounding box of the region proposal.
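The pipeline above can be sketched as follows. This is only an illustrative skeleton: `selective_search`, `cnn_features`, and `svm_score` are hypothetical stand-in stubs (random boxes, a fake 4096-dimensional feature, a dummy linear SVM), not the real models, but the data flow matches the description.

```python
import numpy as np

rng = np.random.default_rng(0)

def selective_search(image, n_proposals=2000):
    # Stand-in for real selective search (which hierarchically merges
    # similar neighbouring regions): here we just sample random boxes.
    h, w = image.shape[:2]
    y0 = rng.integers(0, h // 2, n_proposals)
    x0 = rng.integers(0, w // 2, n_proposals)
    hh = rng.integers(8, h // 2, n_proposals)
    ww = rng.integers(8, w // 2, n_proposals)
    return np.stack([y0, x0, y0 + hh, x0 + ww], axis=1)

def warp(image, box, size=224):
    # Crop the proposal and warp it to the CNN's fixed square input
    # (nearest-neighbour resampling keeps the sketch dependency-free).
    y0, x0, y1, x1 = box
    crop = image[y0:y1, x0:x1]
    ys = np.linspace(0, crop.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, crop.shape[1] - 1, size).astype(int)
    return crop[np.ix_(ys, xs)]

def cnn_features(patch):
    # Placeholder for the backbone CNN, which in R-CNN yields the
    # 4096-dimensional activation of a fully connected layer.
    return np.resize(patch.astype(float).ravel(), 4096)

def svm_score(features):
    # Placeholder per-class linear SVM: score = w . x.
    w = np.full_like(features, 1.0 / features.size)
    return float(features @ w)

image = rng.integers(0, 256, (256, 256), dtype=np.uint8)
proposals = selective_search(image)              # ~2000 region proposals
feats = cnn_features(warp(image, proposals[0]))  # one warped proposal -> features
print(proposals.shape, feats.shape)
```

In the real R-CNN each of the 2000 warped proposals is pushed through the CNN separately, which is exactly the cost Fast R-CNN later removes.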

YOLO — You Only Look Once


All of the previous object detection algorithms use regions to localize the object
within the image: the network does not look at the complete image, only at
parts of the image that have a high probability of containing the object. YOLO, or
You Only Look Once, is an object detection algorithm quite different from the
region-based algorithms above. In YOLO, a single convolutional network
predicts both the bounding boxes and the class probabilities for those boxes.

YOLO works by splitting the image into an SxS grid and, within each
grid cell, predicting m bounding boxes. For each bounding box, the network
outputs class probabilities and offset values for the box. Bounding
boxes whose class probability is above a threshold value are selected and used to
locate the object within the image.
YOLO is orders of magnitude faster (45 frames per second) than other object
detection algorithms. The limitation of the YOLO algorithm is that it struggles with
small objects within the image; for example, it may have difficulty detecting a
flock of birds. This is due to the spatial constraints of the algorithm, since each
grid cell can predict only a limited number of boxes.
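The grid-and-threshold step can be sketched in a few lines. The tensor shapes follow the YOLOv1-style description above, but the "network output" here is random data standing in for a trained model:

```python
import numpy as np

S, M, C = 7, 2, 20           # grid size, boxes per cell, number of classes
rng = np.random.default_rng(1)

# Stand-in network output: per cell and box (x, y, w, h, confidence),
# plus one class-probability vector per cell, as in the original YOLO.
boxes = rng.random((S, S, M, 5))
class_probs = rng.random((S, S, C))
class_probs /= class_probs.sum(axis=-1, keepdims=True)

# Class-specific score = box confidence * conditional class probability.
scores = boxes[..., 4:5] * class_probs[:, :, None, :]    # shape (S, S, M, C)

# Keep only boxes whose best class score clears the threshold.
threshold = 0.10
keep = scores.max(axis=-1) > threshold                   # shape (S, S, M)
kept_boxes = boxes[keep][:, :4]
kept_classes = scores[keep].argmax(axis=-1)
print(kept_boxes.shape[0], "of", S * S * M, "boxes survive the threshold")
```

A full detector would additionally apply non-maximum suppression to the surviving boxes so that each object is reported once.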

Fast R-CNN
The author of the previous paper (R-CNN) solved some of the drawbacks of
R-CNN to build a faster object detection algorithm, called Fast R-CNN.
The approach is similar to the R-CNN algorithm, but instead of feeding the
region proposals to the CNN, we feed the whole input image to the CNN to
generate a convolutional feature map. From the convolutional feature map we
identify the region proposals, warp them into squares, and use an RoI pooling
layer to reshape them into a fixed size so that they can be fed into a fully
connected layer. From the RoI feature vector, a softmax layer predicts the class
of the proposed region along with the offset values for its bounding box.

The reason Fast R-CNN is faster than R-CNN is that you don’t have to feed
all 2000 region proposals to the convolutional neural network every time. Instead,
the convolution operation is performed only once per image, and a feature map is
generated from it.
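The RoI pooling layer mentioned above is the piece that turns arbitrarily sized proposals into fixed-size inputs. A minimal single-channel sketch (max pooling over an adaptive grid; the function name and sizes are illustrative):

```python
import numpy as np

def roi_pool(feature_map, roi, out_size=7):
    # Max-pool one region of interest down to a fixed out_size x out_size
    # grid, so proposals of any shape can feed the fully connected layers.
    y0, x0, y1, x1 = roi
    region = feature_map[y0:y1, x0:x1]
    ys = np.linspace(0, region.shape[0], out_size + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_size + 1).astype(int)
    pooled = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            cell = region[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            pooled[i, j] = cell.max() if cell.size else 0.0
    return pooled

fmap = np.arange(60 * 40, dtype=float).reshape(60, 40)  # one feature-map channel
fixed = roi_pool(fmap, (5, 3, 47, 30))                  # a 42x27 proposal
print(fixed.shape)   # → (7, 7), whatever the RoI size
```

Because all RoIs are pooled from the same shared feature map, the expensive convolutions run once per image rather than once per proposal.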

Q2. With the help of appropriate examples, give the applications of edge
detection.
Answer:

License plate detection


Today, cars are everywhere, and intelligent traffic control will no doubt become the
future trend, so I will briefly discuss how edge detection can be applied to license
plate detection. License plate detection technology is widely used at tollgates as
well as in parking lots at public places, companies, and residential areas, so
improving this technology is of great practical value. First, the sample image is
converted to grayscale and QDPA operator edge detection is applied; the original
and edge-detected images of a sample vehicle can then be compared, with the
plate region standing out as a dense cluster of edges.
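The first two steps, grayscale conversion and locating edge-dense rows where a plate is likely to sit, can be sketched with plain numpy (a random image stands in for a real vehicle photo; a generic horizontal difference stands in for the edge operator, since the QDPA operator itself is not specified here):

```python
import numpy as np

def to_gray(rgb):
    # Standard luminance weights for RGB-to-grayscale conversion.
    return rgb @ np.array([0.299, 0.587, 0.114])

rng = np.random.default_rng(2)
frame = rng.integers(0, 256, (120, 200, 3)).astype(float)  # stand-in camera frame
gray = to_gray(frame)

# Plate regions are rich in vertical character strokes, so a simple
# horizontal intensity difference highlights candidate rows.
gx = np.abs(np.diff(gray, axis=1))
edge_density = gx.mean(axis=1)               # per-row edge energy
candidate_rows = np.argsort(edge_density)[-20:]
print(gray.shape, gx.shape, candidate_rows.size)
```

On a real frame, a band of consecutive high-density rows would then be cropped and passed to character segmentation.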

Detecting hidden information in medical images


What distinguishes our phase-based detection methods from traditional edge
detection methods such as Canny and Sobel is that we can detect not only the
edges of an object but also some hidden information about the test object. It is
impossible for these details to be detected by traditional methods because
there is very little difference between their colour and the colour of the
surrounding regions. In a Boolean system, the truth value 1.0 represents absolute
truth and 0.0 represents absolute falsity. In a fuzzy system, however, there is no
notion of absolute truth or falsity: fuzzy logic also admits intermediate values
that are partially true and partially false, which suits such low-contrast details.

Q3. Enumerate the functions of edge operators.
Answer:

Edge detection operators are of two types:
Gradient-based operators, which compute first-order derivatives of a digital
image: the Sobel operator, Prewitt operator, and Roberts operator.
Gaussian-based operators, which compute second-order derivatives of a digital
image: the Canny edge detector and the Laplacian of Gaussian.

Sobel Operator: A discrete differentiation operator. It computes an
approximation of the gradient of the image intensity function for edge
detection. At each pixel of the image, the Sobel operator produces either the
gradient vector or the norm of that vector. It uses two 3 x 3 kernels or masks
which are convolved with the input image to calculate the horizontal and
vertical derivative approximations respectively.
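The two masks and the convolution can be shown concretely. Below, a hand-rolled 'valid' convolution (no libraries beyond numpy) is applied to a synthetic vertical step edge; the gradient magnitude peaks exactly on the step:

```python
import numpy as np

# The two 3x3 Sobel masks described above.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # horizontal derivative
KY = KX.T                                                         # vertical derivative

def conv2_valid(img, k):
    # Direct 'valid' 2-D convolution (kernel flipped, as convolution requires).
    kf = k[::-1, ::-1]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * kf).sum()
    return out

# A vertical step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 10.0
gx, gy = conv2_valid(img, KX), conv2_valid(img, KY)
magnitude = np.hypot(gx, gy)
print(magnitude.max())   # → 40.0, the strongest response sits on the step
```

Since the rows are uniform, the vertical mask responds with zero everywhere, so the magnitude here comes entirely from the horizontal derivative.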

Prewitt Operator: This operator is very similar to the Sobel operator. It also detects
the vertical and horizontal edges of an image, and is one of the simplest ways to
estimate the orientation and magnitude of edges in an image. It likewise uses a
pair of 3 x 3 kernels or masks.

Roberts Operator: This gradient-based operator computes, through discrete
differentiation, the differences between diagonally adjacent pixels in an image;
the gradient magnitude is then approximated from these differences, for example
as the square root of the sum of their squares. It uses two 2 x 2 kernels or masks.

Marr-Hildreth Operator or Laplacian of Gaussian (LoG): A Gaussian-based
operator which uses the Laplacian to take the second derivative of an image.
This works well when the transition of the grey level is abrupt. It is based on
the zero-crossing method: where the second-order derivative crosses zero,
that particular location corresponds to an edge. Here the Gaussian operator
reduces the noise and the Laplacian operator detects the sharp edges.
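The zero-crossing idea is easy to demonstrate in one dimension: smooth a step edge with a Gaussian, take the discrete second derivative, and look for adjacent samples of opposite sign. The signal and kernel sizes below are arbitrary illustration choices:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=4):
    # Normalised 1-D Gaussian smoothing kernel.
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

# A 1-D step edge, smoothed by the Gaussian to suppress noise.
signal = np.zeros(40)
signal[20:] = 1.0
smooth = np.convolve(signal, gaussian_kernel1d(1.5), mode="same")

# Discrete second derivative (the Laplacian in one dimension).
laplacian = np.diff(smooth, n=2)

# The edge sits at the zero crossing: adjacent samples of the second
# derivative with opposite signs (tiny values treated as exact zeros).
signs = np.sign(np.where(np.abs(laplacian) < 1e-9, 0.0, laplacian))
crossings = np.where(signs[:-1] * signs[1:] < 0)[0]
print(crossings)   # a single crossing, right at the step edge
```

Note that the second derivative itself is large on both sides of the edge; it is the sign change between them, not a peak, that marks the edge location.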

Canny Operator: A Gaussian-based operator for detecting edges. This operator
is not very susceptible to noise and extracts image features without affecting or
altering them. The Canny edge detector is an advanced algorithm derived from
the earlier work on the Laplacian of Gaussian operator, and is widely used as an
optimal edge detection technique. It detects edges based on three criteria:
a low error rate;
edge points must be accurately localized;
there should be just one single edge response.

Q4. Write an essay, with relevant examples, on image segmentation.
Answer:

Image segmentation is a very significant area in computer vision. Image segmentation
partitions an image into multiple regions based on certain similarity constraints, and acts
as the pre-processing stage in several image analysis problems such as image compression
and image recognition. Segmentation is vital for the successful extraction of
image features and for classification. Image segmentation can be defined as the partition of
an image into several regions or categories. These regions can be similar in features
such as colour, texture, or intensity, and every pixel in the image is assigned to exactly
one region. The quality of a segmentation is judged by pixels in the same region being
similar in some characteristic while pixels in different regions differ in that
characteristic. The segmentation process includes restoration, enhancement, and
representation of the image data in the required form.
Image Segmentation Techniques

Image segmentation techniques can be broadly classified based on certain
characteristics. A basic classification distinguishes local and global image
segmentation. A method concerned with segmenting a specific part or region of
the image is known as local image segmentation, while a method concerned with
segmenting the whole image, consisting of a very large number of pixels, is
known as global image segmentation.

The next categorisation of image segmentation methods is based on the properties of
the images to be segmented: the discontinuity-detection-based approach and the
similarity-detection-based approach. In the discontinuity-based approach, the
segmentation relies on discontinuities in the image, as in edge-based segmentation;
the similarity-based approach relies on the similarity of regions, as in threshold-based
segmentation, region growing, and region splitting and merging. A technique based on
information about the structure of the required portion of the image is known as
structural segmentation. Most segmentation methods are of the stochastic type, where
the segmentation depends entirely on the discrete pixel values of the image.

Threshold-based segmentation is the simplest method of segmentation. The
image pixels are segmented based on their intensity level. This kind of segmentation is
most applicable to images where the objects are lighter than the background, and the
method relies on prior knowledge of the image features. There are mainly three types
of threshold-based segmentation. Global thresholding: this method uses a single
threshold value that is constant for the whole image, and the output is based on this
threshold value. Variable thresholding: in this type of segmentation, the value of the
threshold can vary within a single image. Multiple thresholding: in this kind of
thresholding, the output of segmentation is based on multiple threshold values.
Threshold values can be computed from image histograms. In [1], a threshold-based
level set approach building on threshold-based segmentation and the fast marching
method is proposed for medical image segmentation. To improve the image acquisition
process in computer vision, a threshold-based segmentation method based on entropy
criteria and a genetic algorithm is presented in [3].
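Computing a global threshold from the image data can be shown with a classic iterative scheme (alternate between splitting the pixels at T and resetting T to the mean of the two class means); the bimodal test image below is synthetic, and the function name is illustrative:

```python
import numpy as np

def global_threshold(gray, t0=128.0, eps=0.5):
    # Iterative global threshold selection: split pixels at t, move t to
    # the midpoint of the two class means, repeat until it stabilises.
    t = t0
    while True:
        lo, hi = gray[gray <= t], gray[gray > t]
        if lo.size == 0 or hi.size == 0:
            return t
        t_new = 0.5 * (lo.mean() + hi.mean())
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

rng = np.random.default_rng(3)
# Bimodal image: dark background around 40, bright objects around 200.
gray = np.where(rng.random((64, 64)) < 0.7,
                rng.normal(40, 10, (64, 64)),
                rng.normal(200, 10, (64, 64)))
t = global_threshold(gray)
binary = gray > t          # the segmented (binary) image
print(round(t, 1))
```

The converged threshold lands between the two intensity modes, which is exactly the constant cut that global thresholding applies to the whole image.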

Edge-based segmentation relies on sudden changes of intensity values in an
image; in image processing, object boundaries are represented by edges. Edge-based
segmentation works by identifying regions of abrupt intensity change in an
image [4]. There are mainly two types of edge-based segmentation methods. Grey
histogram technique: in this method the foreground is separated from the background
based on a threshold value; choosing the correct threshold value is the main difficulty.
Gradient-based method: the gradient can be defined as the first derivative of the image
near an edge, and a large change in the intensity values between two regions is
reflected in a high gradient magnitude. In order to perform multi-scale image
segmentation, an edge-based auto-threshold-generating method is introduced in.
Another method for edge detection using a variance filter is introduced in.
