Lect-7 Segmentation Localization


Deep Learning

CS-878
Week-07
Image Classification

(assume given a set of possible labels)

{dog, cat, truck, plane, ...}

cat

This image by Nikita is licensed under CC-BY 2.0

Fei-Fei Li, Ehsan Adeli

Computer Vision Tasks

• Classification: CAT (no spatial extent)
• Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
• Object Detection: DOG, DOG, CAT (multiple objects)
• Instance Segmentation: DOG, DOG, CAT (multiple objects)

This image is CC0 public domain

Semantic Segmentation

Assign a label to each pixel in an image:

• Pixel-level image annotation/analysis (vs. object-level analysis)

Common datasets: PASCAL VOC (2012) and MSCOCO

Semantic Segmentation

▪ A key part of Scene Understanding

▪ Applications
▪ Autonomous navigation
▪ Assisting the partially sighted
▪ Medical diagnosis
▪ Image editing
Segmentation Tasks

A sample image from the PASCAL VOC2011 dataset

Panels: Original (input) Image | Semantic (class) Segmentation

Image Source: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/segexamples/index.html

Semantic Segmentation

With semantic segmentation, we are interested in the precise location of the object.

For example, for a 16x16 image, we would get 256 outputs arranged in a 16x16 matrix.

These outputs tell us which class each pixel belongs to.

For just one object in an image, a pixel belongs either to the object or to the background.
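
The 16x16 example above can be sketched in a few lines. This is only an illustration: the per-class score maps are made up by hand rather than produced by a network, and the name `label_map` is hypothetical.

```python
def label_map(scores):
    """scores[c][i][j] is the score of class c at pixel (i, j).

    Returns a height x width matrix holding the best class index per pixel.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][i][j]) for j in range(width)]
        for i in range(height)
    ]

SIZE = 16
# Assumed toy scores: class 0 = background, class 1 = a single 4x4 object.
background = [[0.5] * SIZE for _ in range(SIZE)]
obj = [[1.0 if 4 <= i < 8 and 4 <= j < 8 else 0.0 for j in range(SIZE)]
       for i in range(SIZE)]

labels = label_map([background, obj])  # 16x16 matrix, i.e. 256 outputs
```

Each entry of `labels` is either 0 (background) or 1 (object), which is the one-object case described above.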
Semantic Segmentation: The Problem

Paired training data: for each training image, each pixel is labeled with a semantic category (GRASS, CAT, TREE, SKY, ...).

At test time, classify each pixel of a new image.

Lecture 11 - April 30, 2024

Semantic Segmentation Idea: Sliding Window

Full image

Impossible to classify without context

Q: how do we include context?

Q: how do we model this?

Semantic Segmentation

One straightforward strategy is to modify our classification network

Run the classification network for each pixel in the image using a sliding window

Problems with training and inference
Semantic Segmentation

Another strategy is to modify our classification network by keeping the feature map size the same throughout the network

Problems in training and inference persist

Fully Convolutional networks (FCN)

J. Long, E. Shelhamer, and T. Darrell, Fully Convolutional Networks for Semantic Segmentation, CVPR 2015
Fully "Convolutional" networks (FCN)
• Use networks pre-trained for classification for segmentation! (VGG, AlexNet, etc.)

• Re-interpret the fully-connected layers as convolutional layers.

• Utilize the skip-layer concept to improve the segmentation accuracy.

Fully Convolutional networks

Interpret the FC layers as conv layers.

FCN

Panels: No skip connection | 1-skip connection | 2-skip connections
In-Network upsampling: “Unpooling”

Nearest Neighbor

Input: 2 x 2
1 2
3 4

Output: 4 x 4
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4

“Bed of Nails”

Input: 2 x 2
1 2
3 4

Output: 4 x 4
1 0 2 0
0 0 0 0
3 0 4 0
0 0 0 0
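
A minimal sketch of the two unpooling schemes on the 2 x 2 input from the slide. The function names are hypothetical, and plain nested lists stand in for feature maps.

```python
def nearest_neighbor_unpool(x, factor=2):
    """Copy each input value into a factor x factor block of the output."""
    return [
        [x[i // factor][j // factor] for j in range(len(x[0]) * factor)]
        for i in range(len(x) * factor)
    ]

def bed_of_nails_unpool(x, factor=2):
    """Put each input value in the top-left corner of its block, zeros elsewhere."""
    out = [[0] * (len(x[0]) * factor) for _ in range(len(x) * factor)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            out[i * factor][j * factor] = value
    return out

small = [[1, 2], [3, 4]]
nn_out = nearest_neighbor_unpool(small)
nails_out = bed_of_nails_unpool(small)
```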

In-Network upsampling: “Max Unpooling”

Max Pooling: Remember which element was max!

Input: 4 x 4
1 2 6 3
3 5 2 1
1 2 2 1
7 3 4 8

Output: 2 x 2
5 6
7 8

… Rest of the network …

Max Unpooling: Use positions from pooling layer

Input: 2 x 2
1 2
3 4

Output: 4 x 4
0 0 2 0
0 1 0 0
0 0 0 0
3 0 0 4

Corresponding pairs of downsampling and upsampling layers
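
The pooling/unpooling pair above can be reproduced with a short sketch. The helper names are hypothetical; the pooling step records where each maximum came from, and the unpooling step writes its input back to those positions.

```python
def max_pool_with_indices(x, k=2):
    """k x k max pooling that also records the (row, col) of each maximum."""
    pooled, indices = [], []
    for i in range(0, len(x), k):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), k):
            window = [(x[a][b], (a, b))
                      for a in range(i, i + k) for b in range(j, j + k)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices

def max_unpool(y, indices, out_h, out_w):
    """Place each value of y at the remembered position, zeros elsewhere."""
    out = [[0] * out_w for _ in range(out_h)]
    for i, row in enumerate(y):
        for j, value in enumerate(row):
            a, b = indices[i][j]
            out[a][b] = value
    return out

x = [[1, 2, 6, 3], [3, 5, 2, 1], [1, 2, 2, 1], [7, 3, 4, 8]]
pooled, positions = max_pool_with_indices(x)
unpooled = max_unpool([[1, 2], [3, 4]], positions, 4, 4)
```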

Learnable Upsampling

Recall: Normal 3 x 3 convolution, stride 1 pad 1

Input: 4 x 4, Output: 4 x 4

Dot product between filter and input

Recall: Normal 3 x 3 convolution, stride 2 pad 1

Input: 4 x 4, Output: 2 x 2

Dot product between filter and input

Filter moves 2 pixels in the input for every one pixel in the output

Stride gives ratio between movement in input and output

We can interpret strided convolution as “learnable downsampling”.
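
As a rough check of the two recalled cases, here is a plain-Python convolution with zero padding. It is a sketch, not an efficient implementation; the output size uses the usual formula (input + 2 * pad - kernel) // stride + 1, and the all-ones input and filter are assumptions chosen to make the sums easy to verify.

```python
def conv2d(x, w, stride=1, pad=0):
    """Single-channel 2D convolution (cross-correlation) with zero padding."""
    k = len(w)
    h, wd = len(x), len(x[0])
    padded = [[0.0] * (wd + 2 * pad) for _ in range(h + 2 * pad)]
    for i in range(h):
        for j in range(wd):
            padded[i + pad][j + pad] = x[i][j]
    out_h = (h + 2 * pad - k) // stride + 1
    out_w = (wd + 2 * pad - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the filter and the input window.
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    acc += w[a][b] * padded[i * stride + a][j * stride + b]
            row.append(acc)
        out.append(row)
    return out

ones_input = [[1.0] * 4 for _ in range(4)]
ones_filter = [[1.0] * 3 for _ in range(3)]
same_size = conv2d(ones_input, ones_filter, stride=1, pad=1)    # 4 x 4
downsampled = conv2d(ones_input, ones_filter, stride=2, pad=1)  # 2 x 2
```

With stride 2 the filter skips every other input position, which is why the 4 x 4 input shrinks to 2 x 2.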

Learnable Upsampling: Transposed Convolution

3 x 3 transposed convolution, stride 2 pad 1

Input: 2 x 2, Output: 4 x 4

Input gives weight for filter

Filter moves 2 pixels in the output for every one pixel in the input

Stride gives ratio between movement in output and input

Sum where output overlaps

Learnable Upsampling: 1D Example

Input: (a, b)
Filter: (x, y, z)
Output: (ax, ay, az + bx, by, bz)

Output contains copies of the filter weighted by the input, summing where they overlap in the output
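
The 1D example can be written as a short loop. This sketch assumes stride 2 and no output cropping, which matches the five-element output above; the function name is hypothetical.

```python
def transposed_conv1d(inputs, kernel, stride=2):
    """Each input value scales a copy of the kernel; overlapping copies are summed."""
    out = [0.0] * ((len(inputs) - 1) * stride + len(kernel))
    for i, value in enumerate(inputs):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out

# With a = 2, b = 3 and filter (x, y, z) = (1, 10, 100):
# (ax, ay, az + bx, by, bz) = (2, 20, 203, 30, 300)
upsampled = transposed_conv1d([2.0, 3.0], [1.0, 10.0, 100.0])
```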

Convolution as Matrix Multiplication (1D Example)

We can express convolution in terms of a matrix multiplication

Example: 1D conv, kernel size=3, stride=2, padding=1

Transposed convolution multiplies by the transpose of the same matrix:

Example: 1D transposed conv, kernel size=3, stride=2, padding=0
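
The matrices on the slide did not survive extraction, so the following is a sketch of the idea rather than a copy of the slide's figure. It assumes a length-4 input padded with one zero on each side and a kernel (x, y, z); all helper names are hypothetical.

```python
def conv_matrix(kernel, padded_len, stride):
    """Build the matrix whose rows are shifted copies of the kernel."""
    rows = (padded_len - len(kernel)) // stride + 1
    matrix = [[0.0] * padded_len for _ in range(rows)]
    for r in range(rows):
        for j, weight in enumerate(kernel):
            matrix[r][r * stride + j] = weight
    return matrix

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

kernel = [1.0, 2.0, 3.0]                      # (x, y, z)
padded = [0.0, 4.0, 5.0, 6.0, 7.0, 0.0]       # (0, a, b, c, d, 0), padding=1
X = conv_matrix(kernel, len(padded), stride=2)

conv_out = matvec(X, padded)                  # strided convolution: 2 outputs
transposed_out = matvec(transpose(X), [2.0, 3.0])  # transposed conv: 6 outputs
```

Multiplying by the transpose maps a length-2 vector back to length 6, scattering weighted copies of the kernel exactly as in the 1D example.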

Deconvolution Network for Semantic Segmentation

H. Noh, S. Hong, and B. Han, Learning Deconvolution Network for Semantic Segmentation, ICCV 2015

Figure panels: Input image; 14 × 14 deconvolutional layer; 28 × 28 unpooling layer; 28 × 28 deconvolutional layer; 56 × 56 unpooling layer; 56 × 56 deconvolutional layer; 112 × 112 unpooling layer; 112 × 112 deconvolutional layer; 224 × 224 unpooling layer; 224 × 224 deconvolutional layer
Learned upsampling architectures

Figure source
SegNet

Uses VGG architecture!
No FC layer!

Image source: http://mi.eng.cam.ac.uk/projects/segnet/
V. Badrinarayanan, et al., A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling, 2015
U-Net

Source: Olaf Ronneberger, Philipp Fischer, Thomas Brox “U-Net: Convolutional Networks for Biomedical Image Segmentation”, MICCAI, 2015
Object detection
Object Recognition

• Problem: Given an image A, does A contain an image of a person?

YES
Object localization
Human Detection

• UAV images
• Surveillance images
Object detection

• Multiple objects
Why detection?

• Self-driving car
A simple solution
Sliding window

• Slide a box around image and classify each crop
• There are many interesting papers
• Viola and Jones (2001): Face detector (22K citations)
• Dalal and Triggs (2005): HOG (33K citations)
Naïve approach: Template Matching

Find the chair in this image Output of correlation

This is a chair
Template Matching

Find the chair in this image

Epic fail!
Simple template matching is not going to make it
Sliding window

• General approach
• Scan all possible locations
• Extract features
• Classify features
• Post-processing
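
The general approach above can be sketched as a scan over positions and window sizes. Everything here is illustrative: `score_fn` is a hypothetical stand-in for "extract features, then classify", and a plain score threshold stands in for post-processing.

```python
def sliding_windows(img_w, img_h, window_sizes, stride):
    """Yield (x1, y1, x2, y2) for every window size and position."""
    for win_w, win_h in window_sizes:
        for y in range(0, img_h - win_h + 1, stride):
            for x in range(0, img_w - win_w + 1, stride):
                yield (x, y, x + win_w, y + win_h)

def detect(img_w, img_h, score_fn, window_sizes, stride, threshold):
    """Score every window and keep the ones above the threshold."""
    detections = []
    for box in sliding_windows(img_w, img_h, window_sizes, stride):
        score = score_fn(box)  # hypothetical: features + classifier
        if score >= threshold:
            detections.append((box, score))
    return detections

def toy_score(box):
    # Assumed toy classifier: only one window contains the object.
    return 1.0 if box == (2, 2, 6, 6) else 0.0

windows = list(sliding_windows(8, 8, [(4, 4)], 2))
found = detect(8, 8, toy_score, [(4, 4)], 2, 0.5)
```

Even this 8 x 8 toy image yields 9 windows for a single window size; real images with several scales produce far more, and each window needs its own classifier call.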
Evaluation

• Intersection over union (IoU)

Sample IoU

• True positives
• False positives
• False negatives
• Only one is correct
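
Intersection over union, listed first above, can be computed as follows. The sketch assumes boxes are given as corner coordinates (x1, y1, x2, y2), a convention the slides do not state.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))   # intersection 1, union 7
identical = iou((0, 0, 2, 2), (0, 0, 2, 2))
disjoint = iou((0, 0, 1, 1), (5, 5, 6, 6))
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold; the threshold value is a design choice not fixed by the slides.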

Evaluation

• Precision
• Precision is the ability of a model to identify only the relevant objects.
• It is the percentage of correct positive predictions and is given by: Precision = TP / (TP + FP)

• Recall
• Recall is the ability of a model to find all the relevant cases.
• It is the percentage of true positives detected among all relevant ground truths and is given by: Recall = TP / (TP + FN)
Evaluation

• Sort all predicted boxes (for all images) according to scores
• For each k (location) in the list
• Compute recall and precision

https://github.com/rafaelpadilla/Object-Detection-Metrics
Average precision (AP)

mAP: average AP over multiple classes
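
A sketch of the ranking procedure and of one common way to turn it into AP. The slides do not say which AP variant is meant; this version assumes all-point interpolation (area under the precision envelope), and it assumes each detection has already been marked as a true or false positive.

```python
def precision_recall_points(detections, num_ground_truth):
    """detections: list of (score, is_true_positive). Returns (recall, precision) per rank k."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    points = []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_ground_truth, tp / (tp + fp)))
    return points

def average_precision(points):
    """Area under the interpolated precision-recall curve (assumed AP variant)."""
    ap, previous_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        best_precision = max(p for _, p in points[i:])
        ap += (recall - previous_recall) * best_precision
        previous_recall = recall
    return ap

def mean_average_precision(ap_per_class):
    return sum(ap_per_class) / len(ap_per_class)

toy = [(0.9, True), (0.8, False), (0.7, True)]   # made-up detections
pr = precision_recall_points(toy, num_ground_truth=2)
ap = average_precision(pr)
```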

Simple Recipe for Object Detection

Problem: Is there a turtle in this picture? If yes, localize!

Step 1: Train (or download) a classification model (AlexNet, VGG, GoogLeNet)

Image -> Convolution and Pooling -> Final conv feature map -> Fully-connected layers -> Class scores -> Softmax loss

Step 2: Attach new fully-connected “regression head” to the network.

“Classification head”: Final conv feature map -> Fully-connected layers -> Class scores
“Regression head”: Final conv feature map -> Fully-connected layers -> Box coordinates

Step 3: Train the regression head only using L2 loss

Final conv feature map -> Fully-connected layers -> Box coordinates -> L2 loss

Step 4: At test time use both heads.
Object Detection: Single Object (Classification + Localization)

Image -> network -> Vector: 4096

• Fully Connected: 4096 to 1000 -> Class Scores (Cat: 0.9, Dog: 0.05, Car: 0.01, ...) -> Softmax Loss (Correct label: Cat)
• Fully Connected: 4096 to 4 -> Box Coordinates (x, y, w, h) -> L2 Loss (Correct box: (x’, y’, w’, h’))

Multitask Loss = Softmax Loss + L2 Loss

Treat localization as a regression problem!

This image is CC0 public domain
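
A minimal sketch of the multitask loss for one image. The `box_weight` factor is an assumption added for illustration (the slide simply adds the two losses, which corresponds to a weight of 1), and the numbers are toy values.

```python
import math

def softmax_loss(scores, correct_index):
    """Cross-entropy of softmax(scores) against the correct class."""
    top = max(scores)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_sum - scores[correct_index]

def l2_loss(predicted_box, correct_box):
    """Squared L2 distance between (x, y, w, h) vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted_box, correct_box))

def multitask_loss(scores, correct_index, predicted_box, correct_box, box_weight=1.0):
    return softmax_loss(scores, correct_index) + box_weight * l2_loss(predicted_box, correct_box)

# Two equal class scores give a softmax loss of log(2); the box is off by 1 in x.
loss = multitask_loss([0.0, 0.0], 0, (1.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
```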
Detection as Regression?

• DOG, CAT, CAT, DUCK, each with (x, y, w, h) = 16 numbers
• DOG, CAT, each with (x, y, w, h) = 8 numbers
• CAT, CAT, …, CAT, each with (x, y, w, h) = many numbers

Need variable sized outputs

Detection as Classification

Classify each window:

• CAT? NO, DOG? NO
• CAT? YES!, DOG? NO
• CAT? NO, DOG? YES

Detection as Classification

Problem:
• Need to test many positions and scales
• Use a computationally demanding classifier (CNN)
• Search at different scales
• Search at different positions

Solution: Only look at a tiny subset of possible positions

Region Proposals

● Find “blobby” image regions that are likely to contain objects

● “Class-agnostic” object detector
● Look for “blob-like” regions
Region Proposals: Selective Search
Bottom-up segmentation, merging regions at multiple scales

Convert regions to boxes

Uijlings et al, “Selective Search for Object Recognition”, IJCV 2013

R-CNN

1. Input image
2. Regions of Interest (RoI) from a proposal method (~2k)
3. Warped image regions (224x224 pixels)
4. Forward each region through ConvNet (ImageNet-pretrained)
5. Classify regions with SVMs
6. Bbox reg: predict “corrections” to the RoI, 4 numbers: (dx, dy, dw, dh)

Girshick et al, “Rich feature hierarchies for accurate object detection and semantic segmentation”, CVPR 2014.
Figure copyright Ross Girshick, 2015; source. Reproduced with permission.
R-CNN details

• Regions: uses ~2000 Selective Search proposals

• Network: uses AlexNet pre-trained on ImageNet (1000 classes), fine-tuned on PASCAL (21 classes)
• Final detector:
• first warp proposal regions,
• then extract fc7 network activations (4096 dimensions),
• finally, classify with linear SVM
• Bounding box regression is also used to refine box locations
R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014.
R-CNN Training (initialization)

Step 1: Train (or download) a classification model for ImageNet (AlexNet)

Image -> Convolution and Pooling -> Final conv feature map -> Fully-connected layers -> Class scores (1000 classes) -> Softmax loss
R-CNN Training (Fine-tuning)

Step 2: "Fine-tune" model for detection

- Instead of 1000 ImageNet classes, want 20 object classes + background
- Throw away final fully-connected layer, reinitialize from scratch
- Keep training model using positive / negative regions from detection images

Re-initialize this layer: was 4096 x 1000, now will be 4096 x 21

Image -> Convolution and Pooling -> Final conv feature map -> Fully-connected layers -> Class scores (21 classes) -> Softmax loss
R-CNN Training (feature extraction)

Step 3: Extract features

- Extract region proposals for all images
- For each region: warp to CNN input size, run forward through CNN, save pool5 features to disk
- Have a big hard drive: features are ~200GB for PASCAL dataset!

Image -> Region Proposals -> Crop + Warp -> Forward pass (Convolution and Pooling) -> pool5 features -> Save to disk


R-CNN Training (train classifier)

Step 4: Train one binary SVM per class to classify region features

Training image regions -> Cached region features

• Cat SVM: positive samples and negative samples for cat
• Dog SVM: positive samples and negative samples for dog
R-CNN Training (bounding box regression/prediction)

Step 5 (bbox regression): For each class, train a linear regression model to map from cached features to offsets to GT boxes to make up for “slightly wrong” proposals

Training image regions -> Cached region features

Regression targets (dx, dy, dw, dh), normalized coordinates:

• (0, 0, 0, 0): Proposal is good
• (.25, 0, 0, 0): Proposal too far to left
• (0, 0, -0.125, 0): Proposal too wide
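
To make the targets concrete, here is a sketch of how normalized offsets could be computed and applied. The slides do not spell out the parameterization; this assumes the one described in the R-CNN paper, where boxes are (center x, center y, width, height), translations are scaled by the proposal size, and size changes live in log space. The function names are hypothetical.

```python
import math

def regression_targets(proposal, ground_truth):
    """Offsets (dx, dy, dw, dh) that map a proposal onto the ground-truth box."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))

def apply_deltas(proposal, deltas):
    """Correct a proposal with predicted offsets."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    return (px + pw * dx, py + ph * dy, pw * math.exp(dw), ph * math.exp(dh))

proposal = (100.0, 100.0, 40.0, 20.0)
good = regression_targets(proposal, proposal)            # proposal is good
shifted = apply_deltas(proposal, (0.25, 0.0, 0.0, 0.0))  # proposal too far to left
```

Under this assumption a dx of 0.25 moves the box right by a quarter of its width, and a negative dw shrinks a proposal that is too wide.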
Issue #1 with R-CNN

• Slow in run-time
• Multiple forward passes for each proposal
• There are thousands of proposals

• Solution
• Single forward pass for each image?
Issue #2 with R-CNN

• Separate classifier training

• CNN feature extractor is not trained with classifier and regressor

• Solution
• End-to-end training?
Issue #3 with R-CNN

• Complex training pipeline

• Proposals
• Feature extraction
• Classification

• Solution
• Single forward pass for each image?
Solution

• Fast R-CNN
• Single forward pass for each image
• No separate classifier
• End-to-end training
Fast R-CNN (shown alongside “Slow” R-CNN)

1. Input image
2. Run whole image through ConvNet (“Backbone” network: AlexNet, VGG, ResNet, etc) to get “conv5” features
3. Regions of Interest (RoIs) from a proposal method
4. Crop + Resize features
5. Per-Region Network (CNN)
6. Per-region outputs: Object category (Linear + softmax) and Box offset (Linear)

Girshick, “Fast R-CNN”, ICCV 2015. Figure copyright Ross Girshick, 2015; source. Reproduced with permission.


Fast R-CNN: Another view

R. Girshick, Fast R-CNN, ICCV 2015

Cropping Features: RoI Pool

Input Image (e.g. 3 x 640 x 480) -> CNN -> Image features: C x H x W (e.g. 512 x 20 x 15)

1. Project proposal onto features
2. “Snap” to grid cells
3. Divide into 2x2 grid of (roughly) equal subregions
4. Max-pool within each subregion

Q: how do we resize the 512 x 5 x 4 region to, e.g., a 512 x 2 x 2 tensor?

Region features (here 512 x 2 x 2; in practice e.g. 512 x 7 x 7)

Region features always the same size even if input regions have different sizes!

Problem: Region features slightly misaligned

Girshick, “Fast R-CNN”, ICCV 2015.
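
The four RoI Pool steps can be sketched for a single feature channel. This is an illustration under assumptions: the RoI is already projected into feature-map coordinates, snapping uses floor for the top-left corner and ceil for the bottom-right, and bin edges are rounded outward so neighbouring bins may share a row or column.

```python
import math

def roi_pool(feature, roi, out_size=2):
    """feature: 2D list (one channel); roi: (x1, y1, x2, y2) in feature-map units."""
    # "Snap" to grid cells.
    x1, y1 = math.floor(roi[0]), math.floor(roi[1])
    x2, y2 = math.ceil(roi[2]), math.ceil(roi[3])
    height, width = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_size):
        # Divide into (roughly) equal subregions.
        r0 = y1 + (i * height) // out_size
        r1 = y1 + math.ceil((i + 1) * height / out_size)
        row = []
        for j in range(out_size):
            c0 = x1 + (j * width) // out_size
            c1 = x1 + math.ceil((j + 1) * width / out_size)
            # Max-pool within the subregion.
            row.append(max(feature[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# Toy 6 x 6 feature map whose values grow left-to-right and top-to-bottom.
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
region = roi_pool(feature_map, (0.4, 1.2, 4.6, 4.9))  # snaps to a 5-wide, 4-tall region
```

The snapped region here is 5 x 4, mirroring the question on the slide, and the output is always 2 x 2 regardless of the RoI size.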

Cropping Features: RoI Align

Input Image (e.g. 3 x 640 x 480) -> CNN -> Image features: C x H x W (e.g. 512 x 20 x 15)

1. Project proposal onto features
2. No “snapping”!
3. Sample at regular points in each subregion using bilinear interpolation
4. Max-pool within each subregion

Feature fxy for point (x, y) is a linear combination of features at its four neighboring grid cells:
f11 ∈ R^512 at (x1, y1), f21 ∈ R^512 at (x2, y1), f12 ∈ R^512 at (x1, y2), f22 ∈ R^512 at (x2, y2)

Region features (here 512 x 2 x 2; in practice e.g. 512 x 7 x 7)

He et al, “Mask R-CNN”, ICCV 2017
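
The interpolation formula itself is missing from the extracted slide, so the sketch below uses the standard bilinear weights, max(0, 1 - |x - xi|) * max(0, 1 - |y - yj|), as an assumption. It handles one channel, and the number of sample points per subregion (2 x 2) is also an assumed choice.

```python
import math

def bilinear_sample(feature, x, y):
    """Interpolate a single-channel feature map at a real-valued point (x, y)."""
    x1, y1 = math.floor(x), math.floor(y)
    total = 0.0
    for yi in (y1, y1 + 1):
        for xi in (x1, x1 + 1):
            if 0 <= yi < len(feature) and 0 <= xi < len(feature[0]):
                weight = max(0.0, 1 - abs(x - xi)) * max(0.0, 1 - abs(y - yi))
                total += weight * feature[yi][xi]
    return total

def roi_align(feature, roi, out_size=2, samples=2):
    """No snapping: sample regular points per subregion, then max-pool them."""
    x1, y1, x2, y2 = roi
    bin_w, bin_h = (x2 - x1) / out_size, (y2 - y1) / out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            values = [
                bilinear_sample(feature,
                                x1 + (j + (sx + 0.5) / samples) * bin_w,
                                y1 + (i + (sy + 0.5) / samples) * bin_h)
                for sy in range(samples) for sx in range(samples)
            ]
            row.append(max(values))
        pooled.append(row)
    return pooled

grid = [[0.0, 10.0], [20.0, 30.0]]
center = bilinear_sample(grid, 0.5, 0.5)     # average of the four cells
flat = [[7.0] * 4 for _ in range(4)]
aligned = roi_align(flat, (0.5, 0.5, 2.5, 2.5))
```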

R-CNN vs Fast R-CNN

Problem: Runtime dominated by region proposals!

Girshick et al, “Rich feature hierarchies for accurate object detection and semantic segmentation”, CVPR 2014.
He et al, “Spatial pyramid pooling in deep convolutional networks for visual recognition”, ECCV 2014
Girshick, “Fast R-CNN”, ICCV 2015


Faster R-CNN: Make CNN do proposals!

Insert Region Proposal Network (RPN) to predict proposals from features

Otherwise same as Fast R-CNN: Crop features for each proposal, classify each one

Ren et al, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, NIPS 2015
Figure copyright 2015, Ross Girshick; reproduced with permission


Region Proposal Network

Input Image (e.g. 3 x 640 x 480) -> CNN -> Image features (e.g. 512 x 20 x 15)

Imagine an anchor box of fixed size at each point in the feature map

Conv outputs:
• Anchor is an object? 1 x 20 x 15. At each point, predict whether the corresponding anchor contains an object (binary classification)
• Box corrections: 4 x 20 x 15. For positive boxes, also predict corrections from the anchor to the ground-truth box (regress 4 numbers per pixel)

In practice use K different anchor boxes of different size/scale at each point
• Anchor is an object? K x 20 x 15
• Box transforms: 4K x 20 x 15

Sort the K*20*15 boxes by their “objectness” score, take top ~300 as our proposals


Region proposal network (RPN)
• Slide a small window over the conv5 layer
• Predict object/no object
• Regress bounding box coordinates
• Box regression is with reference to anchors (3 scales x 3 aspect ratios)
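
A sketch of anchor generation and proposal selection for the 20 x 15 feature map used in these slides. The stride of 32 follows from 640 / 20, but the specific scales and aspect ratios are assumptions borrowed from common Faster R-CNN configurations, and the objectness scores are fake.

```python
import math

def make_anchors(feat_w, feat_h, stride, scales, ratios):
    """Return (cx, cy, w, h) anchors: len(scales) * len(ratios) per feature-map point."""
    anchors = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:  # ratio = height / width, area stays scale**2
                    anchors.append((cx, cy, scale / math.sqrt(ratio), scale * math.sqrt(ratio)))
    return anchors

def top_proposals(anchors, objectness, n=300):
    """Sort anchors by objectness score and keep the top n."""
    order = sorted(range(len(anchors)), key=lambda i: objectness[i], reverse=True)
    return [anchors[i] for i in order[:n]]

anchors = make_anchors(20, 15, stride=32,
                       scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))  # K = 9
fake_scores = list(range(len(anchors)))  # placeholder for RPN objectness output
proposals = top_proposals(anchors, fake_scores, n=300)
```

With K = 9 this gives 9 x 20 x 15 = 2700 candidate boxes, of which 300 are kept; in the real network the kept anchors would also be adjusted by the predicted box transforms.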
Figure panels: Classification head | Regression head | Anchor boxes
Faster R-CNN: Make CNN do proposals!

Jointly train with 4 losses:

1. RPN classify object / not object
2. RPN regress box coordinates
3. Final classification score (object classes)
4. Final box coordinates



Faster R-CNN: Make CNN do proposals!
Glossing over many details:
- Ignore overlapping proposals with non-max suppression
- How are anchors determined?
- How do we sample positive / negative samples for training the RPN?
- How to parameterize bounding box regression?
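
Of the details listed above, non-max suppression is easy to sketch. This is the common greedy variant, assumed here because the slides do not specify one; boxes are (x1, y1, x2, y2) and the 0.5 threshold is an illustrative choice.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop any that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept = non_max_suppression(boxes, [0.9, 0.8, 0.7])
```

The second box overlaps the first with IoU of about 0.68, so only the higher-scoring one survives, while the distant third box is kept.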

Faster R-CNN: Make CNN do proposals!

Faster R-CNN is a two-stage object detector

First stage: Run once per image

- Backbone network
- Region proposal network

Second stage: Run once per region

- Crop features: RoI pool / align
- Predict object class
- Predict bbox offset


R-CNN Summary:

R-CNN: Propose regions first. Classify proposed regions one at a time. Output contains: label + bounding box.

Fast R-CNN: Propose regions after the convolutional net. Use convolution implementation of sliding windows to classify all the proposed regions. End-to-end.

Faster R-CNN: Use ConvNet to propose regions. End-to-end.

[Girshick et al., 2013. Rich feature hierarchies for accurate object detection and semantic segmentation]
[Girshick, 2015. Fast R-CNN]
[Ren et al., 2016. Faster R-CNN: Towards real-time object detection with region proposal networks]
Faster R-CNN: Make CNN do proposals!

Do we really need the second stage?


Single-Stage Object Detectors: YOLO / SSD / RetinaNet

Input image: 3 x H x W

Divide image into grid: 7 x 7

Imagine a set of base boxes centered at each grid cell. Here B = 3

Within each grid cell:

- Regress from each of the B base boxes to a final box with 5 numbers: (dx, dy, dh, dw, confidence)
- Predict scores for each of C classes (including background as a class)
- Looks a lot like RPN, but category-specific!

Output: 7 x 7 x (5 * B + C)

Redmon et al, “You Only Look Once: Unified, Real-Time Object Detection”, CVPR 2016
Liu et al, “SSD: Single-Shot MultiBox Detector”, ECCV 2016
Lin et al, “Focal Loss for Dense Object Detection”, ICCV 2017
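
The output size arithmetic can be checked with a small sketch. B = 3 and the 7 x 7 grid come from the slide; C = 20 is an assumed class count for illustration, and the layout of the per-cell vector (boxes first, then class scores) is also an assumption.

```python
def output_shape(grid=7, num_boxes=3, num_classes=20):
    """Shape of the prediction tensor: grid x grid x (5 * B + C)."""
    return (grid, grid, 5 * num_boxes + num_classes)

def split_cell(cell, num_boxes, num_classes):
    """Split one grid cell's vector into B boxes of 5 numbers and C class scores."""
    boxes = [tuple(cell[5 * b: 5 * b + 5]) for b in range(num_boxes)]
    class_scores = list(cell[5 * num_boxes: 5 * num_boxes + num_classes])
    return boxes, class_scores

shape = output_shape()                      # (7, 7, 35)
cell = [float(i) for i in range(shape[2])]  # placeholder predictions for one cell
cell_boxes, cell_scores = split_cell(cell, num_boxes=3, num_classes=20)
```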

YOLO– real-time object detection

Redmon et al. "You only look once: unified, real-time object detection (2015)."

Object Detection: Lots of variables ...

Backbone Network:
- VGG16
- ResNet-101
- Inception V2
- Inception V3
- Inception ResNet
- MobileNet

“Meta-Architecture”:
- Two-stage: Faster R-CNN
- Single-stage: YOLO / SSD
- Hybrid: R-FCN

Image Size

# Region Proposals

…

Takeaways:
- Faster R-CNN is slower but more accurate
- SSD is much faster but not as accurate
- Bigger / Deeper backbones work better

Huang et al, “Speed/accuracy trade-offs for modern convolutional object detectors”, CVPR 2017
Zou et al, “Object Detection in 20 Years: A Survey”, arXiv 2019
R-FCN: Dai et al, “R-FCN: Object Detection via Region-based Fully Convolutional Networks”, NIPS 2016
Inception V2: Ioffe and Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”, ICML 2015
Inception V3: Szegedy et al, “Rethinking the Inception Architecture for Computer Vision”, arXiv 2016
Inception ResNet: Szegedy et al, “Inception-V4, Inception-ResNet and the Impact of Residual Connections on Learning”, arXiv 2016
MobileNet: Howard et al, “Efficient Convolutional Neural Networks for Mobile Vision Applications”, arXiv 2017

Instance Segmentation

- Classification: CAT (no spatial extent)
- Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
- Object Detection: DOG, DOG, CAT (multiple objects)
- Instance Segmentation: DOG, DOG, CAT (multiple objects)

Object Detection: Faster R-CNN

Instance Segmentation: Mask R-CNN
Mask Prediction

Add a small mask network that operates on each RoI and predicts a 28x28 binary mask

He et al, “Mask R-CNN”, ICCV 2017
Mask R-CNN

CNN + RPN -> RoI Align -> 256 x 14 x 14 -> Conv -> 256 x 14 x 14 -> Conv -> C x 28 x 28

- Classification Scores: C
- Box coordinates (per class): 4 * C
- Predict a mask for each of C classes: C x 28 x 28

He et al, “Mask R-CNN”, arXiv 2017
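
The per-RoI output sizes above can be tallied with a short sketch. It also shows one plausible way to turn the C x 28 x 28 mask output into a single binary mask: take the channel of the highest-scoring class and threshold it. The helper names, the 0.5 threshold, and the toy 2 x 2 masks are assumptions for illustration, not details given on the slide.

```python
def mask_rcnn_head_sizes(num_classes, mask_size=28):
    """Per-RoI output sizes of the three heads listed on the slide."""
    return {
        "class_scores": num_classes,                     # C
        "box_coords": 4 * num_classes,                   # 4 * C (per class)
        "masks": (num_classes, mask_size, mask_size),    # C x 28 x 28
    }


def select_instance_mask(class_scores, masks, threshold=0.5):
    """Pick the mask channel of the top-scoring class and binarize it.

    masks is a list of C two-dimensional lists of probabilities.
    The 0.5 threshold is an assumed default.
    """
    best = max(range(len(class_scores)), key=lambda c: class_scores[c])
    binary = [[1 if p >= threshold else 0 for p in row] for row in masks[best]]
    return best, binary


sizes = mask_rcnn_head_sizes(num_classes=80)   # 80 classes is an assumed example
print(sizes)

# Toy example with C = 3 classes and 2 x 2 masks instead of 28 x 28.
toy_scores = [0.1, 0.7, 0.2]
toy_masks = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.9, 0.2], [0.5, 0.4]],
    [[1.0, 1.0], [1.0, 1.0]],
]
best_class, binary_mask = select_instance_mask(toy_scores, toy_masks)
print(best_class, binary_mask)
```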

Mask R-CNN: Example Mask Training Targets
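
One way to picture how such a training target could be built: crop the ground-truth instance mask to the RoI box and resample the crop to the fixed 28 x 28 mask size. The sketch below uses nearest-neighbor sampling in plain Python as a simplifying assumption; the function name `mask_target`, the box convention (integer corners, exclusive end), and the toy mask are hypothetical.

```python
def mask_target(gt_mask, box, size=28):
    """Crop a binary ground-truth mask to a box and resize to size x size.

    gt_mask: 2D list of 0/1 values covering the full image.
    box:     (x0, y0, x1, y1) integer pixel corners, end exclusive.
    Uses nearest-neighbor sampling at output cell centers (a simplification).
    """
    x0, y0, x1, y1 = box
    box_w, box_h = x1 - x0, y1 - y0
    target = []
    for i in range(size):
        src_y = y0 + min(box_h - 1, int((i + 0.5) * box_h / size))
        row = []
        for j in range(size):
            src_x = x0 + min(box_w - 1, int((j + 0.5) * box_w / size))
            row.append(gt_mask[src_y][src_x])
        target.append(row)
    return target


# Toy 8 x 8 image whose left half belongs to the object.
toy_gt = [[1 if x < 4 else 0 for x in range(8)] for y in range(8)]
target = mask_target(toy_gt, box=(2, 2, 6, 6), size=28)
print(len(target), len(target[0]), sum(target[0]))
```

Because the target is defined relative to the RoI, the same object yields different 28 x 28 targets for different proposal boxes.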

Mask R-CNN: Very Good Results!

He et al, “Mask R-CNN”, ICCV 2017

Mask R-CNN: Also does pose

He et al, “Mask R-CNN”, ICCV 2017

Open Source Frameworks

Lots of good implementations on GitHub!

TensorFlow Detection API:
https://github.com/tensorflow/models/tree/master/research/object_detection
Faster R-CNN, SSD, R-FCN, Mask R-CNN, ...

Detectron2 (PyTorch):
https://github.com/facebookresearch/detectron2
Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, R-FCN, ...

Finetune on your own dataset with pre-trained models

Lecture 11 - April 30, 2024

