Lect11 Neural Nets2

Computer Vision

Neural Network:

(Before) Linear score function: f = Wx

(Now) 2-layer Neural Network: f = W2 max(0, W1 x)

[Figure: x (3072) --W1--> h (100) --W2--> s (10)]
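A minimal NumPy sketch of the two score functions above, with the dimensions from the slide (3072-dim input, 100 hidden units, 10 class scores); the random weights are placeholders:

    import numpy as np

    x = np.random.randn(3072)          # flattened 32x32x3 input image
    W = np.random.randn(10, 3072)      # (before) single linear layer
    W1 = np.random.randn(100, 3072)    # (now) first layer of the 2-layer net
    W2 = np.random.randn(10, 100)      # second layer

    s_linear = W.dot(x)                # f = Wx
    h = np.maximum(0, W1.dot(x))       # hidden layer h with ReLU nonlinearity
    s = W2.dot(h)                      # f = W2 max(0, W1 x): 10 class scores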

2
3
Classification

4
Preview [From recent Yann LeCun slides]

5
ImageNet
(slide from Kaiming He’s recent presentation)
7
Working with CNNs in practice:
• Data augmentation
• Transfer learning
• Autoencoders

8
Data Augmentation

9
Classification

10
Data Augmentation

[Figure: load image and label ("cat") -> CNN -> compute loss]

11
Data Augmentation

[Figure: load image and label ("cat") -> transform image -> CNN -> compute loss]
12
Data Augmentation

- Change the pixels without changing the label
- Train on transformed data
- VERY widely used

[Figure: the image vs. what the computer sees]

13
Data Augmentation
1. Horizontal flips
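A sketch of a random horizontal flip in NumPy (image as an H x W x 3 array); flipping with probability 0.5 is the usual convention and an assumption here, not stated on the slide:

    import numpy as np

    def random_horizontal_flip(img, p=0.5):
        # Flip left-right with probability p; the label stays the same.
        if np.random.rand() < p:
            return img[:, ::-1, :]
        return img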

14
Data Augmentation
2. Random crops/scales
Training: sample random crops / scales

15
Data Augmentation
2. Random crops/scales
Training: sample random crops / scales
ResNet:
1. Pick random L in range [256, 480]
2. Resize training image, short side = L
3. Sample random 224 x 224 patch

16
Data Augmentation
2. Random crops/scales
Training: sample random crops / scales
ResNet:
1. Pick random L in range [256, 480]
2. Resize training image, short side = L
3. Sample random 224 x 224 patch

Testing: average a fixed set of crops

17
Data Augmentation
2. Random crops/scales
Training: sample random crops / scales
ResNet:
1. Pick random L in range [256, 480]
2. Resize training image, short side = L
3. Sample random 224 x 224 patch

Testing: average a fixed set of crops


ResNet:
1. Resize image at 5 scales: {224, 256, 384, 480, 640}
2. For each size, use 10 224 x 224 crops: 4 corners + center, + flips
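A sketch of the ResNet-style training crop listed above, assuming Pillow and NumPy; the function name and library choice are mine, not from the slide:

    import numpy as np
    from PIL import Image

    def resnet_train_crop(img):
        # 1. Pick random L in [256, 480]
        L = np.random.randint(256, 481)
        # 2. Resize so the short side equals L
        w, h = img.size
        scale = L / min(w, h)
        img = img.resize((int(round(w * scale)), int(round(h * scale))), Image.BILINEAR)
        # 3. Sample a random 224 x 224 patch
        w, h = img.size
        left = np.random.randint(0, w - 224 + 1)
        top = np.random.randint(0, h - 224 + 1)
        return img.crop((left, top, left + 224, top + 224))

At test time the same idea runs deterministically: resize to each of the listed scales, take the ten fixed 224 x 224 crops per scale, and average the resulting class scores.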
18
Data Augmentation
3. Color jitter
Simple:
Randomly jitter contrast

19
Data Augmentation
3. Color jitter

Simple:
Randomly jitter contrast

Complex:
1. Apply PCA to all [R, G, B] pixels in training set
2. Sample a "color offset" along principal component directions
3. Add offset to all pixels of a training image

(As seen in [Krizhevsky et al. 2012], ResNet, etc)
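A NumPy sketch of the PCA color jitter described above; the 0.1 standard deviation for the sampled offsets follows Krizhevsky et al. and is an assumption here, and images are assumed to be float arrays in [0, 1]:

    import numpy as np

    def fit_color_pca(images):
        # images: N x H x W x 3. PCA over all [R, G, B] pixels in the training set.
        pixels = images.reshape(-1, 3)
        cov = np.cov(pixels, rowvar=False)        # 3x3 covariance of the color channels
        eigvals, eigvecs = np.linalg.eigh(cov)    # principal component directions
        return eigvals, eigvecs

    def pca_color_jitter(img, eigvals, eigvecs, sigma=0.1):
        # Sample a "color offset" along the principal directions and add it to all pixels.
        alphas = np.random.normal(0.0, sigma, size=3)
        offset = eigvecs @ (alphas * eigvals)
        return np.clip(img + offset, 0.0, 1.0)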
20
Data Augmentation
4. Get creative!

Random mix/combinations of:
- translation
- rotation
- stretching
- shearing
- lens distortions, … (go crazy)

21
Data Augmentation: Takeaway

• Simple to implement, use it


• Especially useful for small datasets
• Fits into framework of noise / marginalization

22
Transfer Learning

“You need a lot of data if you want to train/use CNNs”

23
Transfer Learning

“You need a lot of data if you want to train/use CNNs”

24
Transfer Learning with CNNs
1. Train on Imagenet

25
Transfer Learning with CNNs

1. Train on Imagenet
2. Small dataset: feature extractor

[Figure: freeze these (the earlier layers), train this (the new top layer)]
26
Transfer Learning with CNNs

1. Train on Imagenet
2. Small dataset: feature extractor
   [Figure: freeze these (the earlier layers), train this (the new top layer)]
3. Medium dataset: finetuning
   more data = retrain more of the network (or all of it)
   [Figure: freeze these (the earliest layers), train this (a larger portion of the top of the network)]
27
Transfer Learning with CNNs

1. Train on Imagenet
2. Small dataset: feature extractor
   [Figure: freeze these (the earlier layers), train this (the new top layer)]
3. Medium dataset: finetuning
   more data = retrain more of the network (or all of it)
   [Figure: freeze these (the earliest layers), train this (a larger portion of the top of the network)]

   tip: use only ~1/10th of the original learning rate when finetuning the top layer, and ~1/100th on intermediate layers
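A sketch of this recipe in PyTorch/torchvision (not one of the frameworks compared later in the deck); the model, the 10-class output, and the exact learning-rate values are placeholders:

    import torch.nn as nn
    import torch.optim as optim
    import torchvision.models as models

    # 1. Start from a network trained on Imagenet
    # (newer torchvision uses the weights= argument instead of pretrained=True)
    model = models.resnet18(pretrained=True)

    # 2. Small dataset: treat the CNN as a feature extractor
    for p in model.parameters():
        p.requires_grad = False                      # freeze these
    model.fc = nn.Linear(model.fc.in_features, 10)   # new top layer: train this
    optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)

    # 3. Medium dataset: finetune more of the network with reduced learning rates
    for p in model.layer4.parameters():
        p.requires_grad = True
    optimizer = optim.SGD([
        {"params": model.fc.parameters(), "lr": 1e-2},      # ~1/10th of the original lr
        {"params": model.layer4.parameters(), "lr": 1e-3},  # ~1/100th on intermediate layers
    ], momentum=0.9)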
28
CNN Features off-the-shelf: an Astounding Baseline for Recognition [Razavian et al., 2014]

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition [Donahue*, Jia*, et al., 2013]

29
[Figure: a CNN whose early layers are more generic and later layers more specific]

- very little data + very similar dataset: ?
- very little data + very different dataset: ?
- quite a lot of data + very similar dataset: ?
- quite a lot of data + very different dataset: ?

30
[Figure: a CNN whose early layers are more generic and later layers more specific]

- very little data + very similar dataset: Use Linear Classifier on top layer
- very little data + very different dataset: ?
- quite a lot of data + very similar dataset: Finetune a few layers
- quite a lot of data + very different dataset: ?

31
[Figure: a CNN whose early layers are more generic and later layers more specific]

- very little data + very similar dataset: Use Linear Classifier on top layer
- very little data + very different dataset: You're in trouble… Try linear classifier from different stages
- quite a lot of data + very similar dataset: Finetune a few layers
- quite a lot of data + very different dataset: Finetune a larger number of layers

32
Overview

                           Caffe         Torch                          Theano          TensorFlow
Language                   C++, Python   Lua                            Python          Python
Pretrained models          Yes ++        Yes ++                         Yes (Lasagne)   Inception
Multi-GPU: data parallel   Yes           Yes (cunn.DataParallelTable)   Yes (platoon)   Yes
Multi-GPU: model parallel  No            Yes (fbcunn.ModelParallel)     Experimental    Yes (best)
Readable source code       Yes (C++)     Yes (Lua)                      No              No
Good at RNN                No            Mediocre                       Yes             Yes (best)

33
Supervised vs Unsupervised

Supervised Learning
Data: (x, y)
x is data, y is label

Goal: Learn a function to map x -> y

Examples: Classification, regression, object detection, semantic segmentation, image captioning, etc.

34
Supervised vs Unsupervised

Supervised Learning
Data: (x, y)
x is data, y is label
Goal: Learn a function to map x -> y
Examples: Classification, regression, object detection, etc.

Unsupervised Learning
Data: x
Just data, no labels!
Goal: Learn some structure of the data
Examples: Clustering, dimensionality reduction, feature learning, etc.

35
Unsupervised Learning
• Autoencoders
• Traditional: feature learning

36
Autoencoders

[Figure: input data x -> Encoder -> features z]

38
Autoencoders

Encoder:
Originally: Linear + nonlinearity (sigmoid)
Later: Deep, fully-connected
Later: ReLU CNN

[Figure: input data x -> Encoder -> features z]

39
Autoencoders

Encoder:
Originally: Linear + nonlinearity (sigmoid)
Later: Deep, fully-connected
Later: ReLU CNN

z is usually smaller than x (dimensionality reduction)

[Figure: input data x -> Encoder -> features z]

40
Autoencoders

[Figure: input data x -> Encoder -> features z -> Decoder -> reconstructed input data x̂]

41
Autoencoders

Encoder / decoder:
Originally: Linear + nonlinearity (sigmoid)
Later: Deep, fully-connected
Later: ReLU CNN (upconv)

Example: Encoder: 4-layer conv, Decoder: 4-layer upconv

[Figure: input data x -> Encoder -> features z -> Decoder -> reconstructed input data x̂]

42
Autoencoders

Encoder / decoder:
Originally: Linear + nonlinearity (sigmoid)
Later: Deep, fully-connected
Later: ReLU CNN (upconv)

Train for reconstruction with no labels!

Encoder and decoder sometimes share weights.
Example: dim(x) = D, dim(z) = H, we: H x D, wd: D x H = we^T

[Figure: input data x -> Encoder -> features z -> Decoder -> reconstructed input data x̂]

43
Autoencoders

Loss function (often L2), e.g. ||x̂ - x||^2

Train for reconstruction with no labels!

[Figure: input data x -> Encoder -> features z -> Decoder -> reconstructed input data x̂]
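A minimal PyTorch sketch of this setup: a one-hidden-layer encoder/decoder trained with an L2 reconstruction loss and no labels; the dimensions and optimizer settings are placeholders:

    import torch
    import torch.nn as nn

    D, H = 3072, 100                              # dim(x) = D, dim(z) = H

    encoder = nn.Sequential(nn.Linear(D, H), nn.ReLU())
    decoder = nn.Linear(H, D)                     # weight tying (wd = we^T) would instead reuse the encoder weights
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    x = torch.randn(64, D)                        # a batch of unlabeled inputs
    z = encoder(x)                                # features z
    x_hat = decoder(z)                            # reconstructed input
    loss = ((x_hat - x) ** 2).mean()              # L2 reconstruction loss
    opt.zero_grad()
    loss.backward()
    opt.step()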

44
Autoencoders

After training, throw away the decoder!

[Figure: input data x -> Encoder -> features z; the decoder and reconstruction are discarded]

45
Autoencoders

Use the encoder to initialize a supervised model: attach a classifier on top of features z, train for the final task (sometimes with small data) using a loss function (softmax, etc.), and fine-tune the encoder jointly with the classifier.

[Figure: input data x -> Encoder -> features z -> Classifier -> predicted label y (bird, plane, dog, deer, truck, ...)]
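A sketch of that step, continuing in PyTorch: keep the (pretrained) encoder, attach a small classifier on features z, and train both with a softmax / cross-entropy loss; the 10 classes and learning rate are placeholders:

    import torch
    import torch.nn as nn

    D, H = 3072, 100
    encoder = nn.Sequential(nn.Linear(D, H), nn.ReLU())   # in practice: the encoder trained for reconstruction
    classifier = nn.Linear(H, 10)                          # new classifier on top of features z
    opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)

    x = torch.randn(64, D)                                 # a (possibly small) labeled batch
    y = torch.randint(0, 10, (64,))
    logits = classifier(encoder(x))
    loss = nn.functional.cross_entropy(logits, y)          # softmax loss on the final task
    opt.zero_grad()
    loss.backward()
    opt.step()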

46
Autoencoders

Autoencoders can reconstruct data, and can learn features to initialize a supervised model.

[Figure: input data x -> Encoder -> features z -> Decoder -> reconstructed input data x̂]

47
