Coursera Machine Learning Course Week 6 - Slides


Deep Learning:

Searching for Images


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Visual product recommender



I want to buy new shoes, but…

Too many
options online…



Text search doesn’t help…

“Dress shoes”



Visual product search demo



Features are key to
machine learning



Goal: revisit classifiers, but using more
complex, non-linear features

Input: x (sentence from review) → Classifier (MODEL) → Output: y (predicted class)



Image classification

Input: x (image pixels) → Output: y (predicted object)
Neural networks
↓
Learning *very* non-linear features



Linear classifiers
Score(x) = w0 + w1 x1 + w2 x2 + … + wd xd

Score(x) > 0 on one side of the boundary; Score(x) < 0 on the other.
Decision boundary: w0 + w1 x1 + w2 x2 + … + wd xd = 0



Graph representation of classifier:
useful for defining neural networks
Inputs: constant 1 and x1, x2, …, xd, with edge weights w0, w1, …, wd
Score(x) = w0 + w1 x1 + w2 x2 + … + wd xd
Output y: if Score(x) > 0, output 1; if Score(x) < 0, output 0
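The graph representation above amounts to a weighted sum followed by a threshold; a minimal Python sketch (the weights and inputs below are illustrative, not from the slides):

```python
# Linear classifier as in the graph: compute Score(x), then threshold it.
def score(x, w0, w):
    # w0 is the intercept weight on the constant input 1; w pairs with x1..xd
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

def classify(x, w0, w):
    # output 1 if Score(x) > 0, else 0
    return 1 if score(x, w0, w) > 0 else 0

print(classify([2.0, -1.0], w0=-0.5, w=[1.0, 0.5]))  # score = 1.0 > 0, so prints 1
```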



What can a linear classifier represent?
x1 OR x2 and x1 AND x2: each is a one-layer graph with inputs 1, x1, x2 and output y.
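Both gates fit the single-layer form; the weights below are one standard choice (assumed here, the slide does not give them):

```python
def threshold(s):
    # output 1 if the score is positive, else 0
    return 1 if s > 0 else 0

def gate_or(x1, x2):
    # Score = -0.5 + x1 + x2: positive if at least one input is 1
    return threshold(-0.5 + x1 + x2)

def gate_and(x1, x2):
    # Score = -1.5 + x1 + x2: positive only if both inputs are 1
    return threshold(-1.5 + x1 + x2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, gate_or(a, b), gate_and(a, b))
```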



What can’t a simple linear classifier
represent?

XOR: the counterexample to everything

Need non-linear features



Solving the XOR problem:
Adding a layer
XOR = (x1 AND NOT x2) OR (NOT x1 AND x2)
With each unit thresholded to output 0 or 1:
z1 = 1[−0.5 + x1 − x2 > 0]   (x1 AND NOT x2)
z2 = 1[−0.5 − x1 + x2 > 0]   (NOT x1 AND x2)
y  = 1[−0.5 + z1 + z2 > 0]   (z1 OR z2)
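The two-layer network can be checked directly; this sketch uses the −0.5 thresholds and ±1 weights read off the slide's diagram:

```python
def threshold(s):
    # units are thresholded to 0 or 1, as on the slide
    return 1 if s > 0 else 0

def xor(x1, x2):
    z1 = threshold(-0.5 + x1 - x2)    # x1 AND NOT x2
    z2 = threshold(-0.5 - x1 + x2)    # NOT x1 AND x2
    return threshold(-0.5 + z1 + z2)  # z1 OR z2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))  # prints the XOR truth table: 0, 1, 1, 0
```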
A neural network
•  Layers and layers and layers of
linear models and non-linear transformations
(inputs x1, x2 → hidden units z1, z2 → output y)

•  Around for about 50 years
-  Fell into disfavor in the 90s
•  In last few years, big resurgence
-  Impressive accuracy on several benchmark problems
-  Powered by huge datasets, GPUs, & modeling/learning algorithm improvements
Application of deep learning
to computer vision



Image features
•  Features = local detectors
-  Combined to make prediction
-  (in reality, features are more low-level)

(local detectors: nose, eye, eye, mouth → Face!)
Typical local detectors look for
locally “interesting points” in image
•  Image features: collections of
locally interesting points
- Combined to build classifiers

Face!



Many hand-created features exist
for finding interest points…

• Spin Images
[Johnson & Herbert ‘99]
• Textons
[Malik et al. ‘99]
• RIFT
[Lazebnik ’04]
• GLOH
[Mikolajczyk & Schmid ‘05]
• HoG
[Dalal & Triggs ‘05]

SIFT [Lowe ‘99] • …



Standard image
classification approach

Input → Extract features (hand-created) → Use simple classifier (e.g., logistic regression, SVMs) → Face?



Many hand-created features exist for finding interest points (Spin Images, Textons, RIFT, GLOH, HoG, SIFT, …)

… but very painful to design


Deep learning:
implicitly learns features

[Figure: Layer 1 → Layer 2 → Layer 3 → Prediction, with example detectors learned and example interest points detected at each layer]
[Zeiler & Fergus ‘13]


Deep learning performance



Sample results using deep neural networks

•  German traffic sign recognition benchmark
-  99.5% accuracy (IDSIA team)
•  House number recognition
-  97.8% accuracy per character [Goodfellow et al. ’13]



ImageNet 2012 competition:
1.2M training images, 1000 categories
Top 3 teams
[Bar chart: error (best of 5 guesses), axis 0 to 0.3. SuperVision shows a huge gain over ISI and OXFORD_VGG, which exploited hand-coded features like SIFT.]

ImageNet 2012 competition:
1.2M training images, 1000 categories
Winning entry: SuperVision
8 layers, 60M parameters [Krizhevsky et al. ’12]

Achieving these amazing results required:


•  New learning algorithms
•  GPU implementation
Deep learning in computer vision



Scene parsing with deep learning

[Farabet et al. ‘13]



Retrieving similar images
Input Image Nearest neighbors
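Retrieval reduces to nearest-neighbor search in deep-feature space; a minimal sketch with made-up 2-dimensional "deep features" (a real system would extract them with a trained network):

```python
import math

def distance(u, v):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbors(query, database, k=2):
    # database: {image_name: feature_vector}; return the k closest image names
    ranked = sorted(database, key=lambda name: distance(query, database[name]))
    return ranked[:k]

# Toy deep features (hypothetical values, for illustration only)
db = {"boot": [0.9, 0.1], "sneaker": [0.8, 0.2], "sandal": [0.1, 0.9]}
print(nearest_neighbors([0.88, 0.12], db, k=2))  # → ['boot', 'sneaker']
```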



Challenges of deep learning



Deep learning score card
Pros
•  Enables learning of features rather than hand tuning
•  Impressive performance gains
-  Computer vision
-  Speech recognition
-  Some text analysis
•  Potential for more impact



Deep learning workflow
Lots of labeled data → split into training set and validation set
Training set → learn deep neural net
Validation set → validate
Adjust parameters, network architecture, … and repeat


Many tricks needed to work well…
Different types of layers, connections,…
needed for high accuracy

[Krizhevsky et al. ’12]



Deep learning score card
Pros
•  Enables learning of features rather than hand tuning
•  Impressive performance gains
-  Computer vision
-  Speech recognition
-  Some text analysis
•  Potential for more impact
Cons
•  Requires a lot of data for high accuracy
•  Computationally really expensive
•  Extremely hard to tune
-  Choice of architecture
-  Parameter types
-  Hyperparameters
-  Learning algorithm
-  …
Computational cost + so many choices = incredibly hard to tune
Deep features:

Deep learning
+
Transfer learning



Standard image
classification approach

Input → Extract features (hand-created) → Use simple classifier (e.g., logistic regression, SVMs) → Face?

Can we learn features from data, even when we don’t have data or time?
Transfer learning: Use data from
one task to help learn on another
Old idea, explored for deep learning by Donahue et al. ’14 & others

Lots of data: learn neural net → great accuracy on cat vs. dog

Some data: neural net as feature extractor + simple classifier → great accuracy on 101 categories
What’s learned in a neural net

Neural net trained for Task 1: cat vs. dog

Earlier layers are more generic: can be used as a feature extractor for other tasks.
Later layers are very specific to Task 1: should be ignored for other tasks.
Transfer learning in more detail…
For Task 2, predicting 101 categories,
learn only end part of neural net

Neural net trained for Task 1: cat vs. dog

Keep weights fixed! Class?

Use simple classifier
(e.g., logistic regression, SVMs, nearest neighbor, …)
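The recipe above can be sketched end to end; `extract_features` below is a hypothetical stand-in for the frozen early layers of a net trained on Task 1, and the simple classifier on top is a nearest-centroid rule:

```python
# Transfer learning sketch: frozen "neural net" features + a simple classifier.
# extract_features is a made-up stand-in; a real system would use the
# intermediate-layer activations of a pretrained network.

def extract_features(image):
    # Stand-in: map an image (list of pixel values) to 2 fixed features
    return [sum(image) / len(image), max(image) - min(image)]

def train_nearest_centroid(examples):
    # Learn only the end part: one centroid per class over the fixed features
    centroids = {}
    for label, images in examples.items():
        feats = [extract_features(img) for img in images]
        centroids[label] = [sum(col) / len(col) for col in zip(*feats)]
    return centroids

def predict(centroids, image):
    # Classify by the closest class centroid in feature space
    f = extract_features(image)
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(f, centroids[c])))

data = {"bright": [[0.9, 0.8, 1.0]], "dark": [[0.1, 0.0, 0.2]]}
model = train_nearest_centroid(data)
print(predict(model, [0.85, 0.9, 0.95]))  # → bright
```

The key point matches the slide: the "neural net" weights never change; only the tiny classifier on top is trained on the new task's data.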

Earlier layers are more generic: can be used as a feature extractor for other tasks.
Later layers are very specific to Task 1: should be ignored for other tasks.
Careful where you cut:
latter layers may be too task specific
Use these (earlier layers)! Later layers may be too specific for the new task.

[Figure: Layer 1 → Layer 2 → Layer 3 → Prediction, with example detectors learned and example interest points detected at each layer]
[Zeiler & Fergus ‘13]


Transfer learning with deep features workflow
Some labeled data → extract features with neural net trained on a different task
→ split into training set and validation set
Training set → learn simple classifier
Validation set → validate
How general are deep features?



Summary of deep learning



What you can do now…
•  Describe multi-layer neural network models
•  Interpret the role of features as local detectors
in computer vision
•  Relate neural networks to hand-crafted image
features
•  Describe some settings where deep learning
achieves significant performance boosts
•  State the pros & cons of deep learning models
•  Apply the notion of transfer learning
•  Use neural network models trained in one
domain as features for building a model in
another domain
•  Build an image retrieval tool using deep features

