Lec 11

Image Classification


 Input:
o Image
 Output:
o Assign image to one of a fixed set of categories
 Problem:
o Semantic Gap

 What the computer sees:
o An image is just a big grid of numbers between [0, 255].
 E.g. 800 x 600 x 3 (3 RGB channels)
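As a concrete illustration (not from the notes; the array below is random), the grid-of-numbers view looks like this in NumPy:

```python
import numpy as np

# A made-up 800 x 600 RGB image: height x width x 3 channels,
# every entry an integer intensity in [0, 255].
image = np.random.randint(0, 256, size=(600, 800, 3), dtype=np.uint8)

print(image.shape)   # (600, 800, 3)
print(image.dtype)   # uint8
print(image[0, 0])   # the three RGB values of the top-left pixel
```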

Image Classification Challenges

 Viewpoint Variation.
 Intraclass Variation.
 Fine-Grained Categories.
 Background Clutter.
 Illumination Changes.
 Deformation.
 Occlusion.

Image Classification Applications

 Object detection.
 Disease diagnosis based on medical image analysis.
 Image captioning.
 Playing games (such as Go).
 Activity analysis.

Image Classification Based on Data-driven Approach

1. Collect a dataset of images and labels.

2. Use Machine Learning to train a classifier.

3. Evaluate the classifier on new images.

Example: a training set with labels such as airplane, automobile, bird, cat, and deer.
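A minimal sketch of what this data-driven API might look like (the function bodies are placeholders, not the lecture's code):

```python
def train(images, labels):
    """Step 2: use machine learning to build a classifier from (image, label) pairs."""
    # The simplest possible "model" just memorizes the training data
    # (this is exactly what nearest-neighbor methods do).
    return {"images": images, "labels": labels}

def predict(model, test_images):
    """Step 3: evaluate the classifier by predicting labels for new images."""
    # A real classifier would compare test_images against the model here;
    # this placeholder just predicts the first training label every time.
    return [model["labels"][0] for _ in test_images]
```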

Image Classification Datasets: MNIST

 10 classes : Digits 0 to 9
 28 x 28 grayscale images
 50k training images
 10k test images

Image Classification Datasets: CIFAR100

 100 classes
 50k training images
 10k testing images (100 per class)
 32 x 32 RGB images
 20 superclasses with 5 classes each, e.g.:
 Aquatic mammals: beaver, dolphin, otter, seal, whale
 Trees: maple, oak, palm, pine, willow

Image Classification Datasets: ImageNet

 1000 classes
 1.3M training images (1.3k per class)
 50k validation images (50 per class)
 100k test images (100 per class)
 Performance metric: Top-5 accuracy
o The algorithm predicts 5 labels for each image; one of them needs to be right.

Image Classification Datasets: MIT Places

 365 classes of different scene types


 8M training images
 18.25k val images (50 per class)
 328.5k test images (900 per class)
 Images have variable size, often resized to 256 x 256 for training

Comparison between Image Classification Datasets:

                      MNIST                CIFAR100       ImageNet         MIT Places
Size                  28 x 28 grayscale    32 x 32 RGB    Variable size    Variable size, often
                                                                           resized to 256 x 256
No. classes           10 (digits 0 to 9)   100            1000             365
No. training images   50k                  50k            1.3M             8M
No. testing images    10k                  10k            100k             328.5k

K-Nearest Neighbor (KNN)

 Training:
o Memorize all data and labels.

 Testing/Prediction:
o Predict the label of the most similar training image.

 It uses a distance metric to compare images, such as the L1 distance.
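A minimal NumPy sketch of this procedure (illustrative only, not the lecture's code), using the L1 distance between flattened images:

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # "Training" is just memorizing all training images (rows of X) and labels.
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        # For each test image, find the training image with the smallest
        # L1 distance and copy its label.
        predictions = np.empty(len(X), dtype=self.y_train.dtype)
        for i, x in enumerate(X):
            distances = np.sum(np.abs(self.X_train - x), axis=1)  # L1 distance to every training image
            predictions[i] = self.y_train[np.argmin(distances)]
        return predictions

# Usage: nn = NearestNeighbor(); nn.train(X_train, y_train); preds = nn.predict(X_test)
```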

Nearest Neighbor Decision Boundaries

K-Nearest Neighbors

K-Nearest Neighbors (cont’d)
 Instead of copying the label from the single nearest neighbor,
o take a majority vote among the K closest points.

 Using more neighbors


o helps smooth out rough decision boundaries.
o Also, it helps reduce the effect of outliers.

 When K > 1 there can be ties between classes.


o Ties need to be broken somehow.

 With the right choice of distance metric,


o we can apply KNN to any type of data.

KNN: Distance Metric
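For reference, the two distance metrics most commonly used with KNN on raw images, comparing images I1 and I2 pixel by pixel (p indexes pixels), are the L1 (Manhattan) and L2 (Euclidean) distances:

```latex
d_1(I_1, I_2) = \sum_{p} \left| I_1^{p} - I_2^{p} \right|
\qquad\qquad
d_2(I_1, I_2) = \sqrt{\sum_{p} \left( I_1^{p} - I_2^{p} \right)^{2}}
```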

KNN: Hyperparameters
 What is the best
o value of K to use?
o distance metric to use?

 These are examples of hyperparameters:
o choices about our learning algorithm that we don't learn from the training data;
o instead, we set them at the start of the learning process.

 Very problem-dependent.
o In general, we need to try them all and see what works best for our data/task.
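One common recipe (an illustrative sketch, not from the lecture; the data below is random) is to hold out part of the training set as a validation split and sweep over K:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: 1000 flattened 32 x 32 RGB images with 10 possible labels.
X = np.random.rand(1000, 32 * 32 * 3)
y = np.random.randint(0, 10, size=1000)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

for k in [1, 3, 5, 7, 9]:
    knn = KNeighborsClassifier(n_neighbors=k, p=1)   # p=1 selects the L1 distance
    knn.fit(X_train, y_train)
    print(k, knn.score(X_val, y_val))                # pick the K with the best validation accuracy
```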

General Notes
 Universal Approximation:
o As the number of training samples goes to infinity,
 KNN can represent any function

 Curse of dimensionality:
o For uniform coverage of space,
 number of training points needed grows exponentially with
dimension.

 KNN on raw pixels is seldom used:


o Very slow at test time.
o Distance metrics on pixels are not informative.

 KNN with ConvNet features works well.

Neural Networks

Backpropagation Model

Deep Neural Networks
Deep Learning: hierarchical learning algorithms with many layers of representation.

Activation Functions
ReLU is a good default choice for most problems.
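For reference, a few common activation functions (ReLU being the usual default) are:

```latex
\mathrm{ReLU}(x) = \max(0, x) \qquad
\sigma(x) = \frac{1}{1 + e^{-x}} \qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
```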

Components of a Convolutional Network

 Fully-Connected Layers
 Activation Function
 Convolution Layers
 Pooling Layers
 Normalization

Fully Connected Layer
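A fully connected layer multiplies a flattened input vector by a weight matrix and adds a bias, e.g. a 32 x 32 x 3 image stretched into a 3072-dimensional vector mapped to 10 class scores. A minimal PyTorch-style sketch (the sizes are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)                        # one 32 x 32 RGB image
x = x.flatten(start_dim=1)                           # stretch all pixels into a 3072-dim vector
fc = nn.Linear(in_features=3072, out_features=10)    # weight matrix W (10 x 3072) plus bias b
scores = fc(x)                                       # scores = x @ W.T + b, shape (1, 10)
print(scores.shape)
```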

Convolution Layer
 Filters always extend the full depth of the input volume
 Convolve each filter with the image spatially,
o computing dot products at each spatial position.
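A minimal PyTorch sketch of this idea (the specific sizes, 6 filters of 5 x 5 over a 3 x 32 x 32 input, are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)                      # batch of one 32 x 32 RGB image
conv = nn.Conv2d(in_channels=3, out_channels=6,    # 6 filters, each 3 x 5 x 5:
                 kernel_size=5)                    # filters span the full input depth
out = conv(x)                                      # each filter slides spatially, computing dot products
print(out.shape)                                   # (1, 6, 28, 28)
```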

Stacking Convolutions

What do convolutional filters learn?

 Linear classifier:
o One template per class

 First-layer conv filters:


o local image templates (often learning oriented edges and opposing colors)

Receptive Fields
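A standard fact worth noting here (not spelled out in these notes): with stride-1 convolutions of kernel size K, each extra layer grows the receptive field by K - 1, so after L layers each output value depends on an input region of size

```latex
1 + L\,(K - 1)
```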

Convolution Summary
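For a convolution layer with input spatial size W, kernel size K, padding P, and stride S, the output spatial size follows the standard formula:

```latex
W_{\text{out}} = \left\lfloor \frac{W - K + 2P}{S} \right\rfloor + 1
```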

14 | P a g e
Pooling Layers

Max Pooling
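A small worked sketch (the 4 x 4 input values are made up): 2 x 2 max pooling with stride 2 keeps the largest value in each 2 x 2 window and halves each spatial dimension.

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2., 3., 0.],
                    [5., 6., 7., 8.],
                    [3., 1., 2., 4.],
                    [0., 9., 1., 2.]]]])          # shape (1, 1, 4, 4)

pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(x))                                    # [[[[6., 8.], [9., 4.]]]], shape (1, 1, 2, 2)
```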

Pooling Summary

Convolutional Networks

Example: LeNet-5
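A minimal PyTorch-style sketch of the classic LeNet-5 layout (a modern re-implementation: ReLU and max pooling are used here in place of the original sigmoid-like activations and average pooling, and exact details vary between descriptions):

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # 1 x 32x32  -> 6 x 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 6 x 28x28  -> 6 x 14x14
            nn.Conv2d(6, 16, kernel_size=5),   # 6 x 14x14  -> 16 x 10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 16 x 10x10 -> 16 x 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                      # 16 x 5x5 -> 400
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

print(LeNet5()(torch.randn(1, 1, 32, 32)).shape)   # torch.Size([1, 10])
```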

Batch Normalization

 Idea :
o Normalize the outputs of a layer so they have zero mean and unit
variance
 Why?
o Helps reduce internal covariate shift
o Improves optimization

 We can normalize a batch of activations like this:
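For a batch of N activation vectors with features indexed by j, the standard batch normalization equations (with learnable scale gamma and shift beta, and a small epsilon for numerical stability) are:

```latex
\mu_j = \frac{1}{N}\sum_{i=1}^{N} x_{i,j}, \qquad
\sigma_j^2 = \frac{1}{N}\sum_{i=1}^{N} \left( x_{i,j} - \mu_j \right)^2, \qquad
\hat{x}_{i,j} = \frac{x_{i,j} - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}}, \qquad
y_{i,j} = \gamma_j \,\hat{x}_{i,j} + \beta_j
```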

Batch Normalization: Test-Time
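At test time the per-batch statistics are not recomputed; instead, running averages of the mean and variance accumulated during training are plugged in, so the layer becomes a fixed linear transform of its input:

```latex
y = \gamma \,\frac{x - \mu_{\text{running}}}{\sqrt{\sigma^{2}_{\text{running}} + \epsilon}} + \beta
```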

Batch Normalization for ConvNets
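For convolutional layers the usual convention is to compute the mean and variance over the batch and both spatial dimensions, keeping one mean, variance, gamma, and beta per channel; in PyTorch this is nn.BatchNorm2d (the sizes below are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 6, 28, 28)          # a batch of 8 feature maps with 6 channels
bn = nn.BatchNorm2d(num_features=6)    # one (mean, var, gamma, beta) per channel
y = bn(x)                              # normalized over the batch and spatial dims
print(y.shape)                         # (8, 6, 28, 28)
```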

Batch Normalization

 Usually inserted after Fully Connected or Convolutional layers, and before the nonlinearity
 Makes deep networks much easier to train
 Allows higher learning rates and faster convergence
 Networks become more robust to initialization
 Acts as regularization during training
 Zero overhead at test-time:
o can be fused with conv
 Not well-understood theoretically (yet)
 Behaves differently during training and testing:
o this is a very common source of bugs

Group Normalization
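Group normalization (the topic of this final slide) splits the channels into groups and normalizes each group over the channels and spatial positions of a single example, so its statistics do not depend on the batch size. A brief PyTorch sketch (the group and channel counts are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 6, 28, 28)
gn = nn.GroupNorm(num_groups=3, num_channels=6)   # 3 groups of 2 channels each
print(gn(x).shape)   # (8, 6, 28, 28); statistics are per-example, independent of batch size
```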

