Deep Learning based Computer Vision

The document discusses artificial neural networks and deep learning, particularly in the context of image processing and computer vision. It covers topics such as digital images, image formats, spatial filtering, and the effectiveness of deep learning algorithms in tasks like object detection and recognition. Additionally, it explains the architecture of convolutional neural networks and their applications in various fields.


ARTIFICIAL NEURAL NETWORKS AND DEEP LEARNING

Deep Learning and its role in Computer Vision


Introduction to Robotics

Dr. Sandeep Singh Sengar


What is a Digital Image?

An image is a two-dimensional intensity function f(x, y): the value of f at a spatial location (x, y) is the intensity (gray level) of the image at that point.

[Figure: a grayscale image drawn as the surface f(x, y) over the spatial coordinates x and y, with the gray level as the function value.]



Common image formats
– 1 sample per point (B&W), values in [0, 1]
– 1 sample per point (grayscale), values in [0, 255]
– 3 samples per point (Red, Green, and Blue), each in [0, 255]
– 4 samples per point (Red, Green, Blue, and “Alpha”, a.k.a. opacity), RGB in [0, 255], alpha in [0, 255] or [0, 1]



Color Image

RGB Color Space


A color image is just three functions pasted together. We can write this as a “vector-valued” function:

    f(x, y) = [ r(x, y)  g(x, y)  b(x, y) ]^T

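In code, this simply means one array with three channels (a minimal sketch, assuming NumPy; the image here is random and purely illustrative):

```python
import numpy as np

# Hypothetical 4 x 4 RGB image with values in [0, 255] (purely illustrative).
img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)

# The "vector-valued" function f(x, y): one sample per channel at each pixel.
r, g, b = img[..., 0], img[..., 1], img[..., 2]

y, x = 2, 1                       # a spatial location (x, y)
print(img[y, x])                  # f(x, y) = [r(x, y), g(x, y), b(x, y)]
print(r[y, x], g[y, x], b[y, x])  # the same three samples, channel by channel
```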


RGB Image



Image Processing

An image processing operation typically defines a new image g in terms of an existing image f. We can write this image transform as g(x, y) = T[ f(x, y) ].



Why Digital Image Processing?
Digital image processing focuses on two major tasks
– Improvement of pictorial information for human interpretation
– Processing of image data for storage, transmission and
representation for autonomous machine perception
There is some debate about where image processing ends and fields such as image analysis and computer vision begin.



The Spatial Filtering Process
Origin at the top-left of the image f(x, y); x runs across and y runs down.

A simple 3*3 neighbourhood of image pixels centred on e,

    a b c
    d e f
    g h i

is combined with a 3*3 filter (mask) w,

    j k l
    m n o
    p q r

to give the new value of the centre pixel:

    e_processed = n*e + j*a + k*b + l*c + m*d + o*f + p*g + q*h + r*i

The above is repeated for every pixel in the original image to generate the filtered image.
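A minimal sketch of this process in code (assuming NumPy; img is a grayscale image array and w a 3*3 filter; border pixels are skipped here and handled later under border padding):

```python
import numpy as np

def spatial_filter(img, w):
    """Apply a 3*3 filter w to a grayscale image by sliding it over every
    interior pixel and summing the element-wise products (correlation)."""
    H, W = img.shape
    out = np.zeros_like(img, dtype=float)
    for y in range(1, H - 1):          # skip the 1-pixel border for now
        for x in range(1, W - 1):
            neighbourhood = img[y - 1:y + 2, x - 1:x + 2]
            out[y, x] = np.sum(neighbourhood * w)   # e_processed
    return out

# Example: the 3*3 averaging filter used on the smoothing slides below.
w_avg = np.full((3, 3), 1 / 9)
img = np.array([[104, 100, 108],
                [ 99, 106,  98],
                [ 95,  90,  85]], dtype=float)
print(spatial_filter(img, w_avg)[1, 1])   # 98.333..., as computed later
```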
Levels of Digital Image Processing
The continuum from image processing to computer vision
can be broken up into low-, mid- and high-level processes

Low-level process:    Input: image.       Output: image.           Examples: noise removal, image sharpening.
Mid-level process:    Input: image.       Output: attributes.      Examples: object recognition, segmentation.
High-level process:   Input: attributes.  Output: understanding.   Examples: scene understanding, autonomous navigation.



Spatial filters
Recall the two types of neighbourhood:

intensity transformation: neighbourhood of size 1x1

spatial filter (or mask, kernel, template or window): neighbourhood of larger size, e.g. a 3*3 mask

The spatial filter mask is moved from point to point in an image. At each point (x, y),
the response of the filter is calculated
[Figure: a 3*3 neighbourhood about the point (x, y) in an image f(x, y), with the origin at the top-left.]
Neighbourhood Operations

For each pixel in the original image, the outcome is written at the same location in the target image.

[Figure: the neighbourhood of (x, y) in the original image produces the value at (x, y) in the target image.]
Smoothing Spatial Filtering
With the same process, consider the 3*3 neighbourhood of image pixels

    104 100 108
     99 106  98
     95  90  85

and the simple 3*3 smoothing filter whose coefficients are all 1/9. The filtered value of the centre pixel is

    e = 1/9*106 + 1/9*104 + 1/9*100 + 1/9*108 + 1/9*99 + 1/9*98 + 1/9*95 + 1/9*90 + 1/9*85 = 98.3333

The above is repeated for every pixel in the original image to generate the smoothed image.
Spatial filters : Smoothing
Linear smoothing: averaging kernels

Standard average


Spatial filters : Smoothing


Standard Average - example

The mask is moved from point to point in the image. At each point (x, y), the response of the filter is calculated. For the image

    110 120  90 130
     91  94  98 200
     90  91  99 100
     82  96  85  90

the standard averaging filter applied to the top-left 3*3 neighbourhood gives

    (110 + 120 + 90 + 91 + 94 + 98 + 90 + 91 + 99)/9 = 883/9 = 98.1



Spatial filters : Smoothing
Weighted Average - example



Spatial filters : Smoothing
Median Filter - example



Another smoothing example
By smoothing the original image we remove much of the finer detail, leaving only the gross features for thresholding.

[Figure: original image, smoothed image, thresholded image.]



Averaging filter vs. median filter example

[Figure: original image with noise; image after an averaging filter; image after a median filter.]

• Filtering is often used to remove noise from images.


• Sometimes a median filter works better than an averaging filter.

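A small illustration of this difference (a sketch assuming SciPy; scipy.ndimage provides uniform_filter for the averaging mask and median_filter for the median mask):

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
img = np.full((64, 64), 100.0)
rows = rng.integers(0, 64, size=50)
cols = rng.integers(0, 64, size=50)
img[rows, cols] = 255.0           # sprinkle salt noise onto a flat image

mean_filtered = ndimage.uniform_filter(img, size=3)    # 3*3 averaging filter
median_filtered = ndimage.median_filter(img, size=3)   # 3*3 median filter

# The mean filter only smears the outliers; the median filter largely removes them.
print(np.abs(mean_filtered - 100).max(), np.abs(median_filtered - 100).max())
```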


Strange things happen at the edges!
At the edges of an image we are missing pixels to form a complete neighbourhood.

[Figure: at corner and edge pixels, part of the 3*3 neighbourhood falls outside the image f(x, y).]

What happens when the values of the kernel fall outside the image?

Border padding
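One common remedy is to pad the image before filtering so that every pixel has a full neighbourhood (a sketch using NumPy's np.pad; which padding mode is appropriate depends on the application):

```python
import numpy as np

img = np.array([[1, 2],
                [3, 4]])

print(np.pad(img, 1, mode="constant", constant_values=0))  # zero padding
print(np.pad(img, 1, mode="edge"))      # replicate the border pixels
print(np.pad(img, 1, mode="reflect"))   # mirror the image at its edges
```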


Applications

Text Recognition

Biometrics



Computer Vision



“One picture is worth more than a thousand words”



Object Detection
• Moving-object detection is one of the basic and most
active research domains in the field of computer vision.
• The underlying assumption is that moving objects generally cause intensity changes between consecutive frames.
Object Tracking
Object tracking computes the configuration (i.e., position and size) of the target in subsequent frames, given the state of the target in the initial frame.
Object Recognition

Object recognition is a computer vision technique for identifying objects in images or videos.
Medical Image Segmentation

[Figure: medical imaging.]


What is Machine Learning?
Machine learning is a subset of Artificial Intelligence that provides computers with the ability to learn without being explicitly programmed.

Machine learning emerged in the 1950s. The term was coined in 1959 by Arthur Samuel at IBM (who designed a checkers-playing program).

Ref: https://www.forbes.com/sites/kalevleetaru/2019/01/15/why-machine-learning-needs-semantics-not-just-statistics/?sh=730fa3aa77b5
Branches of Machine Learning

Ref: https://www.wordstream.com/blog/ws/2017/07/28/machine-learning-applications
Deep Learning
Deep Learning is a subfield of machine learning concerned
with algorithms inspired by the structure and function of the
brain called artificial neural networks.
DL/ML is used to find the algorithm (model); with large amounts of data, deep learning achieves high performance.

Ref: https://www.intel.la/content/www/xl/es/artificial-intelligence/posts/difference-between-ai-machine-learning-deep-learning.html
Why Deep Learning Today?
▪ Better algorithms and
understanding
▪ Computational power (GPUs,
TPUs, …)
▪ Massive labelled data
▪ Variety of open source tools
and models

Slide adapted from Wai K.


End-to-end approach?



Ref: https://lawtomated.com/a-i-technical-machine-vs-deep-learning/
Deep Learning Process
▪ Deep neural networks provide state-of-the-art accuracy in many tasks, from object detection to speech recognition
▪ They can learn automatically, without predefined knowledge explicitly coded by the programmers



Effectiveness of Deep Learning
▪ Deep learning algorithms attempt to learn
representation by using a hierarchy of multiple
layers
▪ If we provide the system tons of information, it
begins to understand it and respond in useful
ways
▪ Manually designed features are often over-
specified, incomplete and take a long time to
design and validate
▪ Learned features are easy to adapt, fast to learn



Effectiveness of Deep Learning
▪ Deep learning provides a very flexible, universal and learnable framework for representing the world
▪ Can learn in both unsupervised and supervised
manner
▪ Utilize large amounts of training data
▪ Since 2010, deep learning started outperforming
other machine learning techniques especially in
the areas of machine vision and speech
recognition



Deep Learning Examples
▪ Hierarchy of representations with increasing level
of abstraction
▪ Each stage is a kind of trainable nonlinear feature
transform
▪ Image recognition example
• Pixel → edge → texton → motif → part → object
▪ Text example
• Character → word → word group → clause →
sentence → story



Deep Learning in Practice
▪ Visual question answering : Given an image and a
natural language question about the image, the
task is to provide an accurate natural language
answer
▪ Demo: http://visualqa.csail.mit.edu/



Deep Learning Architectures

Architecture         Applications
CNN                  Image recognition, video analysis, natural language processing
RNN                  Speech recognition, handwriting recognition, machine translation
LSTM/GRU networks    Natural language text compression, handwriting recognition, speech recognition, gesture recognition, image captioning
DBN                  Image recognition, information retrieval, natural language understanding, failure prediction
DSN                  Information retrieval, continuous speech recognition


The Spatial Filtering Process
(Recap) A 3*3 filter w,

    j k l
    m n o
    p q r

slides over each 3*3 neighbourhood of image pixels,

    a b c
    d e f
    g h i

and the centre pixel is replaced by

    e_processed = n*e + j*a + k*b + l*c + m*d + o*f + p*g + q*h + r*i

This is repeated for every pixel in the original image to generate the filtered image. A convolutional layer applies exactly this kind of operation, except that the filter coefficients are learned.
Convolutional Neural Network
A Convolutional Neural Network (CNN) is a deep learning algorithm that takes an input image, assigns importance (learnable weights and biases) to various aspects/objects in the image, and can differentiate one from another. The pre-processing required by a CNN is much lower than for other classification algorithms.



CNN layers
An image is passed through a series of layers (a minimal sketch in code follows below):
– Convolutional: the filters can be thought of as feature identifiers
– Nonlinear (ReLU): approximates complex functions
– Max pooling: down-sampling
– Fully connected layers (softmax/sigmoid), which produce the output



Ref: https://towardsdatascience.com/understanding-and-implementing-lenet-5-cnn-architecture-deep-learning-a2d531ebc342
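A minimal sketch of such a layer stack (assuming PyTorch, a 1-channel 28 x 28 input and 10 output classes; the layer sizes are illustrative, not taken from the slides):

```python
import torch
import torch.nn as nn

# Illustrative stack: conv -> ReLU -> max pool -> flatten -> fully connected.
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1),  # 1x28x28 -> 8x28x28
    nn.ReLU(),                                                           # nonlinearity
    nn.MaxPool2d(kernel_size=2),                                         # 8x28x28 -> 8x14x14
    nn.Flatten(),                                                        # -> 8*14*14 = 1568
    nn.Linear(8 * 14 * 14, 10),                                          # class scores (logits)
)

x = torch.randn(1, 1, 28, 28)            # a batch with one image
probs = torch.softmax(model(x), dim=1)   # softmax turns scores into probabilities
print(probs.shape, probs.sum())          # torch.Size([1, 10]), ~1.0
```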
Convolutional Neural Network



Basic idea of convolution


Convolutional Layer Example

Stride s = 2, # filters = 2, # channels = 3, padding p = 1


Size of Output
I/P size: n*n
Filter size: f*f
O/P size: (n-f+1)*(n-f+1)



Padding and stride convolutions
Padding is used to keep the O/P size equal to the I/P size.
With padding p: O/P size = (n+2p-f+1)*(n+2p-f+1); "same" output size requires p = (f-1)/2

With stride s:
O/P size = [floor((n+2p-f)/s)+1] * [floor((n+2p-f)/s)+1]
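A small helper to check these formulas (a sketch; f = 3, s = 2, p = 1 are reused from the convolutional layer example above, while the input size n = 6 is made up for illustration):

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of an n*n input with an f*f filter,
    padding p and stride s: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))             # no padding, stride 1 -> 4   (n - f + 1)
print(conv_output_size(6, 3, p=1))        # "same" padding p = (f-1)/2 -> 6
print(conv_output_size(6, 3, p=1, s=2))   # stride 2 -> 3
```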


Multiple filters
For example, to detect horizontal and vertical edges at the same time.

An n × n × nc input convolved with nc' filters of size f × f × nc gives an output of size (n-f+1) × (n-f+1) × nc', where nc' = # of filters.



Number of parameters in one layer
Suppose 10 filters of size 3*3*3.

Total parameters: [3*3*3 + 1 (bias)] * 10 = 280, i.e. one bias per filter.

The parameter count does not depend on the size of the input image (a beauty of DL), which makes the model less prone to overfitting.
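This count can be checked directly (a sketch assuming PyTorch; Conv2d here stores one 10 x 3 x 3 x 3 weight tensor plus 10 biases):

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)
n_params = sum(p.numel() for p in conv.parameters())
print(n_params)  # 10*3*3*3 weights + 10 biases = 280
```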


Automatically learnt features

Retain most information (edge detectors)

Towards more abstract representation

Encode high level concepts

Sparser representations: detect fewer (more abstract) features

Ref: https://towardsdatascience.com/applied-deep-learning-part-4-convolutional-neural-networks-584bc134c1e2
Non-linear Activation Function



Pooling
▪ The goal of the pooling operation is to reduce the
spatial size of convolved features
▪ Pooling helps extract salient features that are invariant to rotation and position
• For example, even if the orientation of the nose, eyes and ears changes, the image segment would still be detected as a head
• This is one of the most prominent features of CNNs



Pooling
▪ Two types of pooling operators are common: Max
pooling and Average pooling
• Max pooling returns the maximum value from the
portion of the image covered by the filter
• Average pooling returns the average of all the
values from the portion of the image covered by the
filter



Max Pooling
▪ Let's apply a 3 x 3 filter (stride 1) to a 5 x 5 convolved feature map:

    15.5 23.8  7.9 20.6 12.9
    12.7 18.3 22.3  7.9  8.3
    11.3  9.2 11.8 18.9 10.3
    11.7 11.3 17.5  6.8 19.3
    18.3 19.6 11.2 15.2  7.2

▪ The first (top-left) window has maximum 23.8; sliding the window one step at a time gives the 3 x 3 max-pooled map

    23.8 23.8 22.3
    22.3 22.3 22.3
    19.6 19.6 19.3


Average Pooling
▪ Applying the same 3 x 3 filter (stride 1) to the 5 x 5 convolved feature map, each output is the mean of the values in the window: the first two outputs are 14.8 and 15.6, and so on across the map.
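A sketch that reproduces both results on the same 5 x 5 map (plain NumPy, stride 1, no padding):

```python
import numpy as np

fmap = np.array([[15.5, 23.8,  7.9, 20.6, 12.9],
                 [12.7, 18.3, 22.3,  7.9,  8.3],
                 [11.3,  9.2, 11.8, 18.9, 10.3],
                 [11.7, 11.3, 17.5,  6.8, 19.3],
                 [18.3, 19.6, 11.2, 15.2,  7.2]])

def pool2d(x, size=3, stride=1, op=np.max):
    """Slide a size x size window over a square map x and apply op."""
    out_dim = (x.shape[0] - size) // stride + 1
    out = np.zeros((out_dim, out_dim))
    for i in range(out_dim):
        for j in range(out_dim):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = op(window)
    return out

print(pool2d(fmap, op=np.max))   # first row: 23.8 23.8 22.3
print(pool2d(fmap, op=np.mean))  # first row: ~14.8 ~15.6 ~13.4
```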


Max Pooling

For a 4 x 4 input in hidden layer i,

    -4  5  4  6
     0 -3  2 -3
     7  8 -5  9
     3  0 -4  1

possible pooled nodes in hidden layer i+1 are:

    4 x 4 max:                    9

    2 x 2 max, non-overlapping:   5 6
                                  8 9

    2 x 2 max, overlapping:       5 5 6
                                  8 8 9
                                  8 8 9
    (the overlapping result contains the non-overlapping one, so there is no need for both)

Pooling output size: with I/P size n*n, filter size f*f, padding p and stride s, the O/P size is floor((n+2p-f)/s) + 1.
Fully Connected Layer



Fully Connected Layer
• Simply put, these are feed-forward neural networks.
• Fully connected layers form the last few layers of the network.
• The input to the fully connected layer is the output of the final pooling or convolutional layer, in flattened form.
• After passing through the fully connected layers, the final layer uses the softmax activation function to obtain the probability of the input belonging to each class (classification).

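As a small numeric illustration of that final softmax step (the logits here are made up):

```python
import numpy as np

def softmax(z):
    """Turn a vector of class scores (logits) into probabilities."""
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # hypothetical scores for 3 classes
probs = softmax(logits)
print(probs, probs.sum())            # approx [0.659 0.242 0.099], sums to 1
```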


CNN Architectures
There are various CNN architectures that have been key in building the algorithms which power, and will continue to power, AI in the foreseeable future. Some of them are listed below:
• LeNet
• AlexNet
• VGGNet
• GoogLeNet
• ResNet
• ZFNet



U-Net

Ref: Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234-241. Springer, Cham, 2015.
Train, Validation and Test Datasets

• Training Dataset: The sample of data used to fit the model.


• Validation Dataset: The validation set is used to evaluate a given model, and machine learning researchers use it to fine-tune the model hyperparameters. The model occasionally sees this data but never "learns" from it, so the validation set affects the model only indirectly.
• Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the
training dataset. The Test dataset provides the gold standard used to evaluate the model. It is only
used once a model is completely trained (using the train and validation sets).

Make sure the validation and test sets come from the same distribution.

Hyperparameters: learning rate, # iterations, # hidden layers, # hidden units, choice of activation function
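A common way to carve out the three sets (a sketch using scikit-learn's train_test_split; the 60/20/20 proportions and the random data are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 32)        # hypothetical features
y = np.random.randint(0, 2, 1000)   # hypothetical labels

# First split off the test set, then split the rest into train and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600, 200, 200
```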
Under-fitting and Over-fitting

High bias: underfitting
High variance: overfitting
Bias and Variance

Training set error:     1%              15%          15%                       0.5%
Validation set error:   11%             16%          30%                       1%
Result:                 High variance   High bias    High bias and variance    Low bias and variance


Bias-variance Trade-off



Under fitting (High bias)
• A statistical model or machine learning algorithm is said to underfit when it cannot capture the underlying trend of the data.
• Underfitting destroys the accuracy of our machine learning model.
• Training accuracy is very low in this case.

Steps for reducing underfitting:
⮚ Use a bigger network
⮚ Train for longer
⮚ Increase the number of parameters in the model



Overfitting (high variance)
• Overfitting happens when your model fits too well to the training set.
• It then becomes difficult for the model to generalize to new examples
that were not in the training set.
Steps for reducing overfitting:
⮚ Add more data
⮚ Data augmentation (rotate, crop, zoom)
⮚ Simplify the model
⮚ Change the training process (like loss function)
⮚ Early termination
⮚ Regularization
❑ Dropout and drop connect
❑ L1 and L2 regularization

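A sketch of two of these regularizers in PyTorch (dropout as a layer, L2 regularization via the optimizer's weight_decay; the network and values are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(100, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),     # randomly zero half of the activations during training
    nn.Linear(64, 10),
)

# weight_decay adds an L2 penalty on the weights to the update rule.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

model.train()   # dropout active during training
model.eval()    # dropout disabled at test time
```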


Ideas to improve ML/DL strategies
• Collect more data
• Collect more diverse training examples
• Train algorithm longer with suitable optimizer
• Try bigger network
• Try smaller network
• Try dropout
• Add regularization
• Network architectures:
❑ Activation function
❑ #hidden units
❑ Learning rate
❑ Iterations



Problems where ML/DL significantly surpasses
human level performance
• Online advertising: estimating how likely someone is to click on an ad
• Product recommendations
• Loan approval
• Lots of data



CNN for Computer Vision tasks
• Object detection
• Object tracking
• Recognition
• Face recognition
• Action and activity recognition
• Human pose estimation
• Image classification
• Image classification with localization
• Object segmentation
• Image style transfer
• Image colorization
• Image reconstruction
• Image super-resolution
• Image synthesis


Challenges

The challenge of making systems human-like:
• It is difficult to simulate something as complex as the human visual system.
• Objects may come in a variety of sizes and aspect ratios.
• One object must be distinguished from multiple others.
• There is a variety of handwriting styles, curves and shapes employed while writing.
• Deformation, appearance variation, scale variation, occlusion and rotation of objects.

Computer vision has its present challenges, but the humans working on this technology are steadily improving it.
CNN: A Real Example

[Figures: a worked example showing filters and the resulting feature maps.]


Convolutional Neural Network
Suppose the task is to predict an image caption.
▪ The CNN receives an image of, let's say, a cat
• In computer terms, this image is a collection of pixels
▪ Generally, there is one layer (channel) for a greyscale picture and three for a colour picture
▪ During feature learning (i.e., in the hidden layers), the network identifies unique features, for instance the tail of the cat, the ears, etc.
▪ Once the network has thoroughly learned how to recognize a picture, it can provide a probability for each label it knows
▪ The label with the highest probability will become the
prediction of the network
Which Works Better: RNN or CNN?
▪ There is a vast number of neural network architectures, each designed to perform a given task
▪ CNN works very well with images
▪ RNN (Recurrent Neural Network) provides
impressive results with time series and text
analysis



Self-Review Questions
▪ What is convolution and how does it work?
▪ What is pooling and how does it work?
▪ What would be the impact of a large/small stride length?



References

“Digital Image Processing”, Rafael C. Gonzalez & Richard E. Woods, Addison-Wesley, 2002
– Much of the material in the image-processing slides is taken from this book

“Machine Vision: Automated Visual Inspection and Robot Vision”, David Vernon, Prentice Hall, 1991


Thank You
