Deep Learning Algorithms For Object Detection
Deep Learning Algorithms For Object Detection
Detecting the objects in an image along with their location, typically using a
bounded box.
1. A Simple Way of Solving an Object Detection
Task using CNN
1. First we take an image as input
2. Then we divide the image into various regions:
3. We will then consider each region as a separate image.
Input Output
Image Image
• Selective Search algorithm takes these over-segments as initial input
and performs the following steps
1. Add all bounding boxes corresponding to segmented parts to the list of
regional proposals
2. Group adjacent segments based on similarity
(Selective Search uses 4 similarity measures based on color, texture, size and
shape compatibility.)
3. Go to step 1
• At each iteration, larger segments are formed and added to the list of
region proposals. Hence we create region proposals from smaller
segments to larger segments in a bottom-up approach. This is what
we mean by computing “hierarchical” segmentations
This image shows the initial, middle and last step of the
hierarchical segmentation process
• All these regions are then warped to have a fixed size as
required by CNN, and each region is passed to the ConvNet
is applied on these
proposals to bring
objectness object it to same size
score
is applied
Determines the
probability of a
proposal having Regresses the
target object coordinates of
For ZF model(an the proposal
ext of Alexnet)
dimension is 256-d
• Anchor boxes are fixed sized boundary boxes that are placed
throughout the image and have different shapes and sizes.
• Then these feature maps are passed to a fully connected layer which
has a softmax and a linear regression layer. It finally classifies the
object and predicts the bounding boxes for the identified objects.
• All of the object detection algorithms we have discussed so far use
regions to identify the objects. The network does not look at the
complete image in one go, but focuses on parts of the image
sequentially. This creates two complications:
• The algorithm requires many passes through a single image to extract all the
objects
• As there are different systems working one after the other, the performance
of the systems further ahead depends on how the previous systems
performed
5. Summary of the Algorithms covered
The network outputs a class probability and offset values for the bounding box
Limitation:
The limitation of YOLO algorithm is that it struggles with small objects
within the image, for example it might have difficulties in detecting a
flock of birds. This is due to the spatial constraints of the algorithm.