Image Classification Using Convolutional Neural Networks (CNNs)

This document provides an overview of image classification using convolutional neural networks (CNNs). It discusses CNN architecture, including convolutional, pooling, and fully connected layers. Popular CNN models such as LeNet-5, AlexNet, and VGGNet are described. Training CNNs involves backpropagation to update weights and minimize a loss function. Overfitting is a risk that can be addressed through regularization techniques such as dropout. The document serves as an introduction to image classification with CNNs.

Image Classification using

Convolutional Neural Networks (CNNs)

MSc. Jonas Krause


Prof. Dr. Lipyeow Lim
Prof. Dr. Kyungim Baek
Agenda
Introduction
• What we see vs. What computers see (MNIST and CIFAR Datasets)
• Hand-Crafted Features for Image Classification
Deep Learning
• Convolutional Neural Networks (CNNs)
•  Architecture (Convolutional, Pooling, and Fully Connected Layers)
•  Successful CNN Architectures
Training
• Backpropagation
• Overfitting, Regularization and Dropout
Experiments
Transfer Learning
Complex Networks
Slide 2
Agenda (recap): Introduction
Slide 3
Introduction
•  Image classification is the task of taking an input image and outputting a class or a probability over classes that best describes the image
•  For humans, this task is one of the first skills we learn, and it comes naturally and effortlessly as adults

•  Being able to quickly recognize patterns, generalize from prior knowledge, and adapt to different image environments are difficult tasks for machines

Slide 4
Introduction
What we see vs. What computers see

Slide 5
Introduction
MNIST Dataset (http://yann.lecun.com/exdb/mnist/)

• 60,000 training examples

• 10,000 test examples

• Ranking of the best classifiers and their error rates

• Current best accuracy:

•  Ciresan et al., CVPR 2012 → 99.77%

Slide 6
Introduction
MNIST Dataset

Slide 7
Introduction
CIFAR-10 Dataset (https://www.cs.toronto.edu/~kriz/cifar.html)

• Consists of 60,000 32×32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.

Slide 8
Introduction
Color Images

Slide 9
Introduction
Color Images

Slide 10
Introduction

Image Classification (before Deep Learning)

• Hand-Crafted Features

• Texture Features: histogram-based, entropy, Haralick features (co-occurrence matrix), gray-level run-length metrics, Local Binary Pattern, fractal, etc.

• Morphological Features: Hu's moments, shape features, granulometry, bending energy, roundness ratio, etc.

Slide 11
Agenda (recap): Deep Learning
Slide 12
Deep Learning (DL)
"Deep Learning is a new area of Machine Learning, which has been
introduced with the objective of moving Machine Learning closer to
one of its original goals: Artificial Intelligence." (http://deeplearning.net/)

• Key Concepts of Deep Neural Networks


•  Deep-learning networks are distinguished from the more common single-hidden-layer neural networks by their depth

•  More than three layers (including input and output) qualifies as “deep” learning

•  In deep-learning networks, each layer of nodes trains on a distinct set of features based on the previous layer’s output

•  The further you advance into the neural net, the more complex the features your nodes can recognize, since they aggregate and recombine features from the previous layer
Slide 13
Deep Learning (DL)

Different DL Models:
• Deep Neural Network

• Deep Boltzmann Machine


• Restricted Boltzmann Machine
• Deep Belief Networks
• Deep Autoencoders

• Recurrent Neural Networks


• Convolutional Neural Networks

Slide 14
Convolutional Neural Networks (CNNs)
•  CNNs take a biological inspiration from the visual cortex
•  The visual cortex has small regions of cells that are sensitive to
specific regions of the visual field
•  For example, some neurons fired when exposed to vertical edges and some when shown horizontal or diagonal edges

•  Having the neuronal cells in the visual cortex looking for specific characteristics is the basis behind CNNs

Slide 15
Convolutional Neural Networks (CNNs)

Network Architecture
• Convolutional Layer, Pooling Layer, Fully Connected Layer

Slide 16
Convolutional Neural Networks (CNNs)

Convolution Operator

[Figure: Input Image (I) * Filter (K) = Feature Map]

• The 3×3 matrix (K) is called a ‘filter’, ‘kernel’, or ‘feature detector’, and the matrix formed by sliding the filter over the image and computing the dot product is called the ‘Convolved Feature’, ‘Activation Map’, or ‘Feature Map’.
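
To make the sliding-window computation concrete, here is a minimal NumPy sketch of the operation described above (illustrative, not from the slides; the vertical-edge kernel is just one possible filter K):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and take the
    element-wise product-and-sum at each position, producing the Feature Map."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            feature_map[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return feature_map

I = np.random.rand(5, 5)                  # toy 5x5 grayscale input image
K = np.array([[1, 0, -1],
              [1, 0, -1],
              [1, 0, -1]])                # hypothetical vertical-edge filter
print(convolve2d(I, K).shape)             # (3, 3) Feature Map
```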
Slide 17
Convolutional Neural Networks (CNNs)

Slide 18
Convolutional Neural Networks (CNNs)

Slide 19
Convolutional Neural Networks (CNNs)

Convolution Operator
• Different filters will produce different Feature Maps for the same input image. For example:

[Figure: the same input image convolved with different filters]

Slide 20
Convolutional Neural Networks (CNNs)

Slide 21
Convolutional Neural Networks (CNNs)

Slide 22
Convolutional Neural Networks (CNNs)

Convolutional Layer
• In practice, a CNN learns the values of these filters on its own during the training process

• However, we still need to specify hyperparameters such as the number of filters, filter size, padding, and stride before training begins, as in the sketch below
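
As an illustration, a single convolutional layer with those hyperparameters might be declared as follows in PyTorch (a sketch; the specific values are arbitrary assumptions, not from the slide):

```python
import torch.nn as nn

# Hyperparameters are fixed before training; the filter weights
# themselves are learned during training.
conv = nn.Conv2d(
    in_channels=3,    # e.g. an RGB input image
    out_channels=32,  # number of filters
    kernel_size=3,    # 3x3 filter size
    padding=1,        # zero-pad 1 pixel on each border
    stride=1,         # slide the filter 1 pixel at a time
)
```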

Slide 23
Convolutional Neural Networks (CNNs)

Activation Layer (ReLU)


• An additional operation called the Rectified Linear Unit (ReLU) is used after every convolution operation

• ReLU is an element-wise operation (applied per pixel) that replaces all negative pixel values in the feature map with zero

• The purpose of ReLU is to introduce non-linearity into the network
Slide 24
Convolutional Neural Networks (CNNs)

Activation Layer (ReLU)

• Other non-linear functions such as tanh or sigmoid can also be used instead of ReLU, but ReLU has been found to perform better in most situations.
Slide 25
Convolutional Neural Networks (CNNs)

Pooling Layer
• The pooling layer downsamples the volume spatially, independently in each depth slice of the input

• The most common downsampling operation is max, giving rise to max pooling, here shown with a stride of 2 (see the sketch below)
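
A minimal NumPy sketch of 2×2 max pooling with stride 2 on a single depth slice (illustrative, not from the slides):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Keep only the largest value in each (size x size) window,
    halving the spatial dimensions when size == stride == 2."""
    h = (x.shape[0] - size) // stride + 1
    w = (x.shape[1] - size) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

x = np.arange(16).reshape(4, 4)   # toy 4x4 depth slice
print(max_pool(x))                # [[ 5.  7.] [13. 15.]]
```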
Slide 26
Convolutional Neural Networks (CNNs)

Fully Connected Layer


• Neurons in a fully connected layer have full connections to all activations in the previous layer, as in regular neural networks

Slide 27
Convolutional Neural Networks (CNNs)

Architectures

Slide 28
Convolutional Neural Networks (CNNs)

Example: Input >> [ [ Conv >> ReLU ] * 2 >> Pool ] * 3 >> FC
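
One way to read this pattern is as a PyTorch module sketch; the channel counts, 32×32 input size, and 10-class output are illustrative assumptions, not from the slide:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # [ Conv >> ReLU ] * 2 >> Pool
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
    )

# Input >> [ [ Conv >> ReLU ] * 2 >> Pool ] * 3 >> FC
model = nn.Sequential(
    conv_block(3, 32),            # 32x32 -> 16x16
    conv_block(32, 64),           # 16x16 -> 8x8
    conv_block(64, 128),          # 8x8   -> 4x4
    nn.Flatten(),
    nn.Linear(128 * 4 * 4, 10),   # FC layer producing 10 class scores
)
```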

Slide 29
Convolutional Neural Networks (CNNs)

In summary:
• A CNN is, in the simplest case, a list of layers that transform the image volume into an output volume (e.g. class scores)

• There are a few distinct types of layers (e.g. CONV/RELU/POOL/FC are by far the most popular)

• Each layer may or may not have parameters (e.g. CONV/FC do, RELU/POOL don’t)

• Each layer may or may not have additional hyperparameters (e.g. CONV/FC/POOL do, RELU doesn’t)

Slide 30
Successful CNN architectures

LeNet-5
• This architecture is an excellent “first architecture” for a CNN

Slide 31
Successful CNN architectures

AlexNet
• Famous for winning the ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) in 2012

Slide 32
Successful CNN architectures

VGGNet

Slide 33
Successful CNN architectures

AlexNet vs. VGGNet (16 and 19)

Slide 34
Agenda (recap): Training
Slide 35
Training

Backpropagation
• Algorithm to compute how each weight and bias affects the cost, so they can be updated
• Cost Function

• Minimize the cost function by following the negative of its gradient

•  This is the mathematical equivalent of computing dL/dW, where W are the weights at a particular layer

Slide 36
Training

Backpropagation
• Weight Updates

• Learning Rate
•  Parameter chosen by the programmer

•  A high learning rate means that bigger steps are taken in the weight updates

•  However, a learning rate that is too high could result in jumps that are too large and not precise enough to reach the optimal point (see the sketch below)
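
A minimal numeric sketch of the update rule W ← W − lr · dL/dW, using the toy cost L(w) = w² (illustrative, not from the slides):

```python
w, lr = 5.0, 0.1          # initial weight and learning rate
for _ in range(50):
    grad = 2 * w          # dL/dw for L(w) = w**2
    w = w - lr * grad     # step against the gradient
print(w)                  # close to the optimum w = 0

# With lr = 1.5 the same loop oscillates and diverges: each step
# overshoots the minimum, as described above.
```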

Slide 37
Training

Overfitting
• Our model might have learned the training set (along with any noise present within it) perfectly, but failed to capture the underlying process that generated it

•  In CNNs, overfitting may occur if we don't have sufficient training examples: a small group of neurons can become responsible for doing most of the processing while other neurons become redundant

Slide 38
Training

Regularization
• Rather than reducing the number of parameters, for CNNs we impose constraints on the model parameters during training to keep them from learning the noise in the training data
• Dropout: this has the effect of forcing the neural network to cope with failures, and not to rely on the existence of a particular neuron (or set of neurons), relying instead on a consensus of several neurons within a layer (a sketch follows below)
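
A minimal sketch of (inverted) dropout in NumPy, assuming a drop probability p; real frameworks provide this as a built-in layer (e.g. torch.nn.Dropout):

```python
import numpy as np

def dropout(activations, p=0.5, training=True):
    """Randomly zero each neuron with probability p during training,
    scaling the survivors by 1/(1-p) so the expected activation is
    unchanged; at test time all neurons are kept."""
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) >= p) / (1.0 - p)
    return activations * mask
```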

Slide 39
Agenda (recap): Experiments
Slide 40
Experiments
Malaria Recognition
• Training Dataset:

• Adapted AlexNet:

Slide 41
Experiments
Malaria Recognition
• Feature Maps Learned:

• Thin Blood Smear Analysis Framework:

Slide 42
Experiments
Malaria Recognition
• Results:

Slide 43
Experiments
Plant Recognition
• Dataset → Plants in Natural Images (natural background)
• Step 1: Segmentation
•  Using MIT Scene Parsing, pre-trained model (ADE20K dataset)

Slide 44
Experiments
Plant Recognition
• MIT Scene Parsing → Stacked CNNs

Slide 45
Experiments
Plant Recognition
• Initial Results:

Slide 46
Experiments
Plant Recognition
• Initial Results:

Slide 47
Agenda (recap): Transfer Learning
Slide 48
Transfer Learning
•  Most pre-trained models use the ImageNet dataset (1 million images, 1,000 classes); a fine-tuning sketch follows below
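
A minimal fine-tuning sketch, assuming a recent torchvision and a hypothetical 5-class target task: load ImageNet weights, freeze the feature extractor, and replace the final classifier:

```python
import torch.nn as nn
from torchvision import models

# Load a network pre-trained on the ImageNet dataset (1,000 classes)
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pre-trained feature-extraction layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer for the new task;
# only this layer is trained on the new dataset.
model.fc = nn.Linear(model.fc.in_features, 5)  # hypothetical 5 classes
```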

Slide 49
Agenda (recap): Complex Networks
Slide 50
Complex Networks
Deconvolutional Neural Networks (DCNN)

Slide 51
Complex Networks
GoogLeNet (ILSVRC 2014 winner)

Slide 52
Complex Networks
GoogLeNet
• Inception Modules

Slide 53
Complex Networks
GoogLeNet
• Inception Modules (Network inside a network)

Slide 54
Complex Networks
ResNet (ILSVRC 2015 winner)

Slide 55
Slide 56
Complex Networks
ResNet
• Residual Block
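
A minimal PyTorch sketch of a residual block (omitting the batch normalization used in the actual ResNet): the stacked layers learn a residual F(x), and a skip connection adds the input x back to their output:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # skip connection: output is F(x) + x
```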

Slide 57
Complex Networks
CUImage (Fast Region-based CNN)
• ILSVRC 2016 winner

Slide 58
Complex Networks
SENet & SE-ResNet (Squeeze-and-Excitation)
• ILSVRC 2017 winner

Slide 59
That’s all folks!!! Thank you!

Slide 60
Annex I
CNN useful links:
• http://cs231n.github.io/convolutional-networks/

• https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/

• https://cambridgespark.com/content/tutorials/convolutional-neural-networks-with-keras/index.html#fnref1

• https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

• https://docs.gimp.org/en/plug-in-convmatrix.html

Slide 61
