Lecture 2:

Image Classification
A Core Task in Computer Vision

Fei-Fei Li, Ranjay Krishna, Danfei Xu Lecture 2 - 1 April 1, 2021


Administrative: Assignment 1
Due 4/16 11:59pm

- K-Nearest Neighbor
- Linear classifiers: SVM, Softmax
- Two-layer neural network
- Image features



Administrative: Course Project

Project proposal due 4/19 (Monday)

Find your teammates on Piazza (the pinned “Search for Teammates” post)

Collaboration: Slack / Zoom

“Is X a valid project for 231n?” --- Piazza private post / TA Office Hours

More info on the website



Administrative: Sections

This Friday 11:30-12:30 pm (recording will be made available)

Python / Numpy, Google Cloud Platform, Google Colab

Presenter: Rachel Gardner (TA)



Syllabus

Neural Network Fundamentals
- Data-driven approaches
- Linear classification & kNN
- Loss functions
- Optimization
- Backpropagation
- Multi-layer perceptrons
- Neural Networks

Convolutional Neural Networks
- Convolutions
- PyTorch 1.4 / TensorFlow 2.0
- Activation functions
- Batch normalization
- Transfer learning
- Data augmentation
- Momentum / RMSProp / Adam
- Architecture design

Computer Vision Applications
- RNNs / LSTMs / Transformers
- Image captioning
- Interpreting neural networks
- Style transfer
- Adversarial examples
- Fairness & ethics
- Human-centered AI
- 3D vision
- Deep reinforcement learning
- Scene graphs
- Self-supervised learning



Today:
● The image classification task
● Two basic data-driven approaches to image classification
○ K-nearest neighbor and linear classifier



Image Classification: A core task in Computer Vision

cat

This image by Nikita is licensed under CC-BY 2.0



The Problem: Semantic Gap

What the computer sees

This image by Nikita is licensed under CC-BY 2.0

An image is a tensor of integers between [0, 255]:
e.g. 800 x 600 x 3 (3 channels RGB)
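
To make "a tensor of integers" concrete, here is a minimal NumPy sketch. The array is synthetic (all zeros with one pixel set), and the (height, width, channel) ordering is an assumed convention; the slide only gives the size 800 x 600 x 3.

```python
import numpy as np

# Synthetic stand-in for a photo: 600 rows (height) x 800 columns (width) x 3 channels (RGB).
# The (height, width, channel) ordering is an assumption, not something the slide specifies.
image = np.zeros((600, 800, 3), dtype=np.uint8)

# Set the top-left pixel to pure red.
image[0, 0] = [255, 0, 0]

print(image.shape)  # (600, 800, 3)
print(image.dtype)  # uint8, so every value lies in [0, 255]
print(image.size)   # 1440000 numbers in total
```

The classifier never sees "a cat"; it only sees these 1,440,000 integers.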



Challenges: Viewpoint variation

All pixels change when the camera moves!

This image by Nikita is licensed under CC-BY 2.0



Challenges: Background Clutter

This image is CC0 1.0 public domain This image is CC0 1.0 public domain



Challenges: Illumination

This image is CC0 1.0 public domain This image is CC0 1.0 public domain This image is CC0 1.0 public domain This image is CC0 1.0 public domain



Challenges: Occlusion

This image is CC0 1.0 public domain
This image is CC0 1.0 public domain
This image by jonsson is licensed under CC-BY 2.0



Challenges: Deformation

This image by Umberto Salvagnin is licensed under CC-BY 2.0
This image by Umberto Salvagnin is licensed under CC-BY 2.0
This image by sare bear is licensed under CC-BY 2.0
This image by Tom Thai is licensed under CC-BY 2.0



Challenges: Intraclass variation

This image is CC0 1.0 public domain



An image classifier

Unlike, e.g., sorting a list of numbers, there is no obvious way to hard-code
an algorithm for recognizing a cat or other classes.



Attempts have been made

Find edges, find corners, ... ?

John Canny, “A Computational Approach to Edge Detection”, IEEE TPAMI 1986



Machine Learning: Data-Driven Approach
1. Collect a dataset of images and labels
2. Use Machine Learning algorithms to train a classifier
3. Evaluate the classifier on new images
Example training set
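
The three steps above suggest a two-function API: one function that learns from labeled images and one that labels new images. The sketch below is only an illustration of that shape. The names `train` and `predict`, the dictionary "model", and the most-common-label rule are all hypothetical placeholders, not a method from the lecture.

```python
import numpy as np

def train(images, labels):
    """Hypothetical training step: this toy 'model' only remembers the most common label."""
    values, counts = np.unique(labels, return_counts=True)
    return {"most_common_label": values[np.argmax(counts)]}

def predict(model, test_images):
    """Hypothetical prediction step: give every test image the remembered label."""
    return np.full(len(test_images), model["most_common_label"])

# Tiny synthetic dataset: 5 flattened "images" of 4 pixels each, with integer labels.
train_images = np.arange(20).reshape(5, 4)
train_labels = np.array([0, 1, 1, 2, 1])
test_images = np.ones((3, 4))

model = train(train_images, train_labels)
predicted = predict(model, test_images)
print(predicted)  # [1 1 1]
```

Any real classifier replaces the body of `train` with an actual learning algorithm; the interface stays the same.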



Nearest Neighbor Classifier



First classifier: Nearest Neighbor

Memorize all data and labels

Predict the label of the most similar training image



First classifier: Nearest Neighbor

Training data with labels: deer, bird, plane, cat, car

Query data: ?

Distance Metric



Distance Metric to compare images

L1 distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$

(take the pixel-wise absolute differences, then add them up)
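
As a small worked example of the L1 distance, the sketch below takes pixel-wise absolute differences and adds them. The two 2x2 images are made-up values chosen only for illustration, and `l1_distance` is a hypothetical helper name.

```python
import numpy as np

def l1_distance(image_a, image_b):
    """Sum of absolute pixel-wise differences between two same-shaped images."""
    # Cast to a signed type first: subtracting uint8 arrays would wrap around.
    a = image_a.astype(np.int64)
    b = image_b.astype(np.int64)
    return int(np.sum(np.abs(a - b)))

# Two made-up 2x2 grayscale images.
test_image = np.array([[56, 32], [90, 23]], dtype=np.uint8)
train_image = np.array([[10, 20], [8, 10]], dtype=np.uint8)

print(l1_distance(test_image, train_image))  # 46 + 12 + 82 + 13 = 153
```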



Nearest Neighbor classifier

Memorize training data

For each test image:
- Find closest train image
- Predict label of nearest image
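
The two steps (memorize, then look up the closest training image) can be sketched in NumPy as follows. This is a minimal sketch, assuming images are already flattened into rows and using the L1 distance; the class and method names are illustrative, and the data is synthetic.

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        """X is N x D (one flattened image per row); y holds the N labels."""
        # Training only memorizes the data.
        self.X_train = X.astype(np.float64)
        self.y_train = y

    def predict(self, X):
        """Return the label of the closest training row (L1 distance) for each row of X."""
        X = X.astype(np.float64)
        y_pred = np.empty(X.shape[0], dtype=self.y_train.dtype)
        for i in range(X.shape[0]):
            # Distance from test row i to every training row at once.
            distances = np.sum(np.abs(self.X_train - X[i]), axis=1)
            y_pred[i] = self.y_train[np.argmin(distances)]
        return y_pred

# Synthetic data: three training "images" of 4 pixels each.
X_train = np.array([[0, 0, 0, 0], [10, 10, 10, 10], [200, 200, 200, 200]])
y_train = np.array([0, 1, 2])
X_test = np.array([[9, 12, 8, 11], [190, 210, 205, 199]])

nn = NearestNeighbor()
nn.train(X_train, y_train)
print(nn.predict(X_test))  # [1 2]
```

Note that `train` does no computation at all, while `predict` compares each test image against every stored training image.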



Nearest Neighbor classifier

Q: With N examples, how fast are training and prediction?

Ans: Train O(1), predict O(N)

This is bad: we want classifiers that are fast at prediction; slow training is OK



Nearest Neighbor classifier

Many methods exist for fast / approximate nearest neighbor (beyond the scope of 231N!)

A good implementation:
https://github.com/facebookresearch/faiss

Johnson et al, “Billion-scale similarity search with GPUs”, arXiv 2017



What does this look like?

1-nearest neighbor


K-Nearest Neighbors
Instead of copying label from nearest neighbor,
take majority vote from K closest points

K=1 K=3 K=5
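
A minimal sketch of the majority vote, assuming 2-D points and the L1 distance. `knn_predict` is a hypothetical helper, the data is made up, and ties between labels are broken by whichever label was counted first, which is an arbitrary choice.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k):
    """Majority vote among the k training points closest to x_query (L1 distance)."""
    distances = np.sum(np.abs(X_train - x_query), axis=1)
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Made-up data: one class-0 point sits right next to a cluster of class-1 points.
X_train = np.array([[1.0, 1.0], [2.0, 2.0], [2.0, 3.0], [3.0, 2.0], [8.0, 8.0]])
y_train = np.array([0, 1, 1, 1, 0])
x_query = np.array([1.2, 1.2])

print(knn_predict(X_train, y_train, x_query, k=1))  # 0
print(knn_predict(X_train, y_train, x_query, k=3))  # 1
```

With K=1 the single closest point decides the label; with K=3 the two class-1 neighbors outvote it.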



K-Nearest Neighbors: Distance Metric

L1 (Manhattan) distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$

L2 (Euclidean) distance: $d_2(I_1, I_2) = \sqrt{\sum_p (I_1^p - I_2^p)^2}$

Figure panels: L1 (Manhattan) distance, K=1; L2 (Euclidean) distance, K=1
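
The choice of metric can change which training point counts as "nearest". The sketch below uses three made-up 2-D points to show one such case; `l1` and `l2` are hypothetical helper names.

```python
import numpy as np

def l1(a, b):
    """L1 (Manhattan) distance: sum of absolute differences."""
    return float(np.sum(np.abs(a - b)))

def l2(a, b):
    """L2 (Euclidean) distance: square root of the sum of squared differences."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

query = np.array([0.0, 0.0])
point_a = np.array([3.0, 3.0])  # diagonal from the query
point_b = np.array([5.0, 0.0])  # along one axis

print(l1(query, point_a), l1(query, point_b))    # 6.0 5.0, so b is nearer under L1
print(l2(query, point_a) < l2(query, point_b))   # True, so a is nearer under L2
```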



K-Nearest Neighbors: try it yourself!

http://vision.stanford.edu/teaching/cs231n-demos/knn/



Hyperparameters

What is the best value of k to use?
What is the best distance to use?

These are hyperparameters: choices about the algorithms themselves.

Very problem/dataset-dependent.
Must try them all out and see what works best.



Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the training data
BAD: K = 1 always works perfectly on training data
Data split: train

Idea #2: Choose hyperparameters that work best on test data
BAD: No idea how algorithm will perform on new data
Never do this!
Data split: train, test

Idea #3: Split data into train, val; choose hyperparameters on val and evaluate on test
Better!
Data split: train, validation, test
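
A minimal sketch of Idea #3 for choosing k. Everything concrete here is an assumption made for illustration: the synthetic two-blob data, the 60/20/20 split, the candidate values of k, and the helper `knn_accuracy`.

```python
import numpy as np

def knn_accuracy(X_train, y_train, X_eval, y_eval, k):
    """Accuracy of a k-nearest-neighbor classifier (L2 distance, majority vote)."""
    correct = 0
    for x, y in zip(X_eval, y_eval):
        distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
        nearest_labels = y_train[np.argsort(distances)[:k]]
        prediction = np.bincount(nearest_labels).argmax()
        correct += int(prediction == y)
    return correct / len(y_eval)

# Synthetic two-class data: two Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, (150, 2)), rng.normal(2.0, 1.0, (150, 2))])
y = np.concatenate([np.zeros(150, dtype=int), np.ones(150, dtype=int)])

# Shuffle, then split 60% train / 20% validation / 20% test.
order = rng.permutation(len(X))
X, y = X[order], y[order]
X_train, y_train = X[:180], y[:180]
X_val, y_val = X[180:240], y[180:240]
X_test, y_test = X[240:], y[240:]

# Choose k using the validation set only.
candidate_ks = [1, 3, 5, 7, 9]
val_accuracy = {k: knn_accuracy(X_train, y_train, X_val, y_val, k) for k in candidate_ks}
best_k = max(val_accuracy, key=val_accuracy.get)

# Touch the test set once, at the very end.
test_accuracy = knn_accuracy(X_train, y_train, X_test, y_test, best_k)
print(best_k, test_accuracy)
```

The test accuracy is computed exactly once, after k has been fixed, so it remains an honest estimate of performance on new data.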



Setting Hyperparameters

Idea #4: Cross-Validation: Split data into folds, try each fold as validation and average the results

Data split (shown three times, once per choice of validation fold): fold 1, fold 2, fold 3, fold 4, fold 5, test

Useful for small datasets, but not used too frequently in deep learning
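
A minimal sketch of 5-fold cross-validation over k, assuming the test set has already been set aside. The synthetic data, the candidate values of k, and the helper `knn_accuracy` are illustrative assumptions.

```python
import numpy as np

def knn_accuracy(X_train, y_train, X_eval, y_eval, k):
    """Accuracy of a k-nearest-neighbor classifier (L2 distance, majority vote)."""
    correct = 0
    for x, y in zip(X_eval, y_eval):
        distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
        nearest_labels = y_train[np.argsort(distances)[:k]]
        correct += int(np.bincount(nearest_labels).argmax() == y)
    return correct / len(y_eval)

# Synthetic training data (two Gaussian blobs), already shuffled.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
order = rng.permutation(len(X))
X, y = X[order], y[order]

num_folds = 5
X_folds = np.array_split(X, num_folds)
y_folds = np.array_split(y, num_folds)

results = {}
for k in [1, 3, 5, 7]:
    accuracies = []
    for i in range(num_folds):
        # Fold i is the validation set; the remaining folds form the training set.
        X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
        y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        accuracies.append(knn_accuracy(X_tr, y_tr, X_folds[i], y_folds[i], k))
    results[k] = (float(np.mean(accuracies)), float(np.std(accuracies)))

for k, (mean, std) in results.items():
    print(k, round(mean, 3), round(std, 3))
```

Each k gets five accuracy values; the mean summarizes how well it works and the standard deviation shows how much that estimate varies across folds.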



Example Dataset: CIFAR10
10 classes
50,000 training images
10,000 testing images

Test images and nearest neighbors

Alex Krizhevsky, “Learning Multiple Layers of Features from Tiny Images”, Technical Report, 2009.



Setting Hyperparameters

Example of 5-fold cross-validation for the value of k.

Each point: single outcome.

The line goes through the mean, bars indicate standard deviation

(Seems that k ~= 7 works best for this data)



What does this look like?



k-Nearest Neighbor with pixel distance is never used.

- Distance metrics on pixels are not informative
- Very slow at test time

Figure panels: Original, Occluded, Shifted (1 pixel), Tinted

Original image is CC0 public domain
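
The sketch below hints at why pixel distances are not informative: a one-pixel shift or a uniform tint leaves the content essentially unchanged, yet produces a large L2 distance. The "image" here is random noise standing in for a photo, which is an assumption and exaggerates the shift effect compared with natural images; the tint amount of 40 is also arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 32x32 RGB "image" with random pixel values (a stand-in for a real photo).
original = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float64)

# Shift one pixel to the right (wrapping around at the border).
shifted = np.roll(original, shift=1, axis=1)

# Uniform tint: add a constant to every value, staying within [0, 255].
tinted = np.clip(original + 40.0, 0.0, 255.0)

def l2(a, b):
    """L2 (Euclidean) distance between two same-shaped images."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

print(l2(original, original))  # 0.0
print(l2(original, shifted))   # large, although the content only moved by one pixel
print(l2(original, tinted))    # also large, for a purely cosmetic change
```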



k-Nearest Neighbor with pixel distance is never used.

- Curse of dimensionality

Dimensions = 1, Points = 4
Dimensions = 2, Points = 4^2
Dimensions = 3, Points = 4^3
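
The pattern is exponential: keeping 4 points per axis requires 4^d points in d dimensions. The short sketch below evaluates this, including for the 3072 dimensions of a 32x32x3 image; `points_needed` is a hypothetical helper name.

```python
def points_needed(points_per_axis, dimensions):
    """Number of grid points needed to keep the same spacing along every axis."""
    return points_per_axis ** dimensions

for d in (1, 2, 3):
    print(d, points_needed(4, d))  # 1 4, then 2 16, then 3 64

# A 32x32x3 image lives in 3072 dimensions; the required count has 1850 decimal digits.
print(len(str(points_needed(4, 3072))))  # 1850
```

No realistic dataset can cover the space of images that densely, so nearest neighbors in pixel space tend to be far away.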



K-Nearest Neighbors: Summary
In image classification we start with a training set of images and labels, and
must predict labels on the test set

The K-Nearest Neighbors classifier predicts labels based on the K nearest training examples

Distance metric and K are hyperparameters

Choose hyperparameters using the validation set;
only run on the test set once at the very end!



Linear Classifier



Parametric Approach

Image: array of 32x32x3 numbers (3072 numbers total)
W: parameters or weights
f(x,W): 10 numbers giving class scores



Parametric Approach: Linear Classifier

f(x,W) = Wx + b

Image x: array of 32x32x3 numbers (3072 numbers total), shape 3072x1
W: parameters or weights, shape 10x3072
b: shape 10x1
f(x,W): 10 numbers giving class scores, shape 10x1
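
The shapes can be checked with a few lines of NumPy. This is a sketch with random stand-in values: a real W and b would be learned from data, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

num_classes, num_pixels = 10, 32 * 32 * 3  # 10 classes, 3072 numbers per image

# Random stand-ins for the image, the weights, and the bias.
image = rng.integers(0, 256, size=(32, 32, 3))
W = rng.normal(0.0, 0.01, size=(num_classes, num_pixels))  # 10 x 3072
b = np.zeros((num_classes, 1))                              # 10 x 1

x = image.reshape(num_pixels, 1).astype(np.float64)         # 3072 x 1 column vector
scores = W @ x + b                                          # 10 x 1 class scores

print(x.shape, W.shape, b.shape, scores.shape)  # (3072, 1) (10, 3072) (10, 1) (10, 1)
print(int(np.argmax(scores)))                   # index of the highest-scoring class
```

Prediction is a single matrix multiply and add, so its cost does not depend on the number of training images.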



Neural Network

Linear
classifiers

This image is CC0 1.0 public domain



Two young girls are playing with lego toy.
Boy is doing backflip on wakeboard.
Man in black shirt is playing guitar.
Construction worker in orange safety vest is working on road.

Karpathy and Fei-Fei, “Deep Visual-Semantic Alignments for Generating Image Descriptions”, CVPR 2015
Figures copyright IEEE, 2015. Reproduced for educational purposes.



[Krizhevsky et al. 2012] Linear layers

[He et al. 2015]



Recall CIFAR10

50,000 training images


each image is 32x32x3

10,000 test images.



Example with an image with 4 pixels, and 3 classes (cat/dog/ship)

Input image (2x2):
56 231
24 2

Flatten tensors into a vector: x = [56, 231, 24, 2]



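Example with an image with 4 pixels, and 3 classes (cat/dog/ship)

Input image (2x2):
56 231
24 2

Flatten tensors into a vector, then compute Wx + b:

$$
\begin{bmatrix} 0.2 & -0.5 & 0.1 & 2.0 \\ 1.5 & 1.3 & 2.1 & 0.0 \\ 0 & 0.25 & 0.2 & -0.3 \end{bmatrix}
\begin{bmatrix} 56 \\ 231 \\ 24 \\ 2 \end{bmatrix}
+
\begin{bmatrix} 1.1 \\ 3.2 \\ -1.2 \end{bmatrix}
=
\begin{bmatrix} -96.8 \\ 437.9 \\ 60.75 \end{bmatrix}
$$

Rows, top to bottom: cat score, dog score, ship score.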
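
The arithmetic of this example can be reproduced directly. The sketch below uses the W, x, and b values from the slide; only the variable names are added. Note that the ship row evaluates to 61.95 before the bias is added and to 60.75 once b = -1.2 is included.

```python
import numpy as np

image = np.array([[56, 231],
                  [24, 2]])
x = image.flatten()  # [56, 231, 24, 2]

W = np.array([[0.2, -0.5, 0.1, 2.0],    # cat
              [1.5, 1.3, 2.1, 0.0],     # dog
              [0.0, 0.25, 0.2, -0.3]])  # ship
b = np.array([1.1, 3.2, -1.2])

scores = W @ x + b
for name, score in zip(["cat", "dog", "ship"], scores):
    print(name, round(float(score), 2))
```

With these weights the dog class scores highest, so this (untrained) classifier would label the image as a dog.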



Interpreting a Linear Classifier



Interpreting a Linear Classifier: Visual Viewpoint



Interpreting a Linear Classifier: Geometric Viewpoint

f(x,W) = Wx + b

Array of 32x32x3 numbers (3072 numbers total)

Plot created using Wolfram Cloud
Cat image by Nikita is licensed under CC-BY 2.0



Hard cases for a linear classifier

- Class 1: First and third quadrants. Class 2: Second and fourth quadrants.
- Class 1: 1 <= L2 norm <= 2. Class 2: Everything else.
- Class 1: Three modes. Class 2: Everything else.



f(x,W) = Wx + b

Coming up:
- Loss function (quantifying what it means to have a “good” W)
- Optimization (start with random W and find a W that minimizes the loss)
- ConvNets! (tweak the functional form of f)
