Deep Learning Models
2012-05-03
Byoung-Hee Kim
Biointelligence Lab, CSE,
Seoul National University
NOTE: Most slides are taken from talks by Geoffrey Hinton, Andrew Ng, and Yoshua Bengio.
[Figure: network diagram labeled input, output, and target; the output reads "Two!"]
Historical background:
First generation neural networks
Perceptrons (~1960) used a layer of hand-coded features and tried to recognize objects by learning how to weight these features.
There was a neat learning algorithm for adjusting the weights.
But perceptrons are fundamentally limited in what they can learn to do.
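The weight-adjustment idea can be sketched in a few lines. This is a minimal illustration, assuming the "neat learning algorithm" is the classic perceptron update rule (the slide does not name it). The function names, the toy AND/XOR data, and the learning rate are hypothetical choices for illustration. With the two input features held fixed, the sketch learns AND but cannot reproduce XOR, which is one standard example of the limitation mentioned above.

```python
def train_perceptron(features, targets, epochs=20, lr=1.0):
    """Perceptron rule (assumed): change the weights only on mistakes."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(features, targets):
            activation = bias + sum(w * xi for w, xi in zip(weights, x))
            y = 1 if activation > 0 else 0
            error = t - y  # -1, 0, or +1
            if error != 0:
                # Move the weights toward (or away from) the active features.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Threshold the weighted sum of the fixed features."""
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


# Hypothetical toy data: two fixed binary features per example.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TARGETS = [0, 0, 0, 1]  # linearly separable
XOR_TARGETS = [0, 1, 1, 0]  # not linearly separable

and_w, and_b = train_perceptron(INPUTS, AND_TARGETS)
and_preds = [predict(and_w, and_b, x) for x in INPUTS]

xor_w, xor_b = train_perceptron(INPUTS, XOR_TARGETS)
xor_preds = [predict(xor_w, xor_b, x) for x in INPUTS]

print("AND learned:", and_preds == AND_TARGETS)
print("XOR learned:", xor_preds == XOR_TARGETS)
```

No single set of weights over these two features can separate the XOR cases, so the second run always leaves at least one example misclassified, however long it trains.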
[Figure: Sketch of a typical perceptron from the 1960s. Bottom to top: input units (e.g. pixels); non-adaptive hand-coded features; output classes "Bomb" and "Toy".]
[Figure: multilayer network, bottom to top: input vector; hidden layers; outputs]
(C) 2012, SNU Biointelligence Lab, http://bi.snu.ac.kr/
However, attempts to train models with deep architectures were largely unsuccessful until 2006.
http://www.iro.umontreal.ca/~pift6266/H10/notes/deepintro.html
Agenda
Computer Perception
References
Appendix
Feature Learning
[Figure: Input -> Learning algorithm. Input space (axes: pixel 1, pixel 2) showing Motorbikes and Non-Motorbikes.]
Feature Learning
[Figure: Input -> Feature Extractor ("handle", "wheel") -> Learning algorithm. Motorbikes and Non-Motorbikes shown in the input space (axes: pixel 1, pixel 2) and in the feature space (axes: handle, wheel).]
[Figure: perception pipelines built on low-level features]
Low-level vision features -> Recognition
Audio classification: Audio -> Low-level audio features -> Speaker identification
Helicopter control: Helicopter -> Low-level state features -> Action
Learning representations
[Figure: Sensor -> Feature Representation -> Learning algorithm]
SIFT
HoG
Textons
Spin image
RIFT
GLOH
Audio features
MFCC
Spectrogram
Flux
ZCR
Rolloff
Auditory cortex learns to see.
[Figure label: Auditory Cortex]
[Figure: Unlabeled images -> Learning algorithm -> Feature representation]
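Binary Stochastic Neuron
The probability of turning on is determined by the weighted input from other units (plus a bias):
$$p(s_i = 1) = \frac{1}{1 + \exp\left(-b_i - \sum_j s_j w_{ji}\right)}$$
[Figure: sigmoid curve of $p(s_i = 1)$, rising from 0 to 1, plotted against $b_i + \sum_j s_j w_{ji}$]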
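A binary stochastic neuron can be sketched as follows: compute the logistic function of the total input, then sample a 0/1 state with that probability. This is a minimal illustration. The function names, the example states, weights, and bias, and the random seed are all assumptions chosen so that the total input is exactly zero.

```python
import math
import random


def turn_on_probability(bias, states, weights):
    """p(s_i = 1) = 1 / (1 + exp(-(b_i + sum_j s_j * w_ji)))."""
    total_input = bias + sum(s * w for s, w in zip(states, weights))
    # Branch on the sign so exp() never receives a large positive argument.
    if total_input >= 0:
        return 1.0 / (1.0 + math.exp(-total_input))
    e = math.exp(total_input)
    return e / (1.0 + e)


def sample_state(bias, states, weights, rng):
    """Turn the unit on (1) with probability p(s_i = 1), otherwise off (0)."""
    return 1 if rng.random() < turn_on_probability(bias, states, weights) else 0


rng = random.Random(0)       # fixed seed, for repeatability only
states = [1, 0, 1]           # hypothetical states s_j of the other units
weights = [0.5, -1.0, 1.5]   # hypothetical weights w_ji into unit i
bias = -2.0                  # hypothetical bias b_i

# Total input: -2.0 + 0.5 + 1.5 = 0.0, so the unit is on half the time.
p_on = turn_on_probability(bias, states, weights)
samples = [sample_state(bias, states, weights, rng) for _ in range(10000)]
empirical = sum(samples) / len(samples)
print(p_on, empirical)
```

The neuron is stochastic because the same input can yield different states on different draws. Only the probability of turning on is fixed by the weighted input.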
[Figure: layered network, top to bottom: 10 label neurons; 500 neurons; 500 neurons; 28 x 28 pixel image]
http://www.cs.toronto.edu/~hinton/digits.html
Gabor functions.
[Images from DeAngelis, Ohzawa & Freeman, 1995]
[Figure: Natural Images]
Test example
$$x \approx 0.8 \cdot f_{36} + 0.3 \cdot f_{42} + 0.5 \cdot f_{63}$$
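The test example expresses a patch x as a weighted sum of a few basis features. The sketch below illustrates only that reconstruction step, using the three coefficients from the slide. The basis features here are random stand-ins, and the patch size and dictionary size are hypothetical. In practice the features would be learned from data, and the activations would be found by an optimization that is not shown here.

```python
import random

rng = random.Random(0)   # fixed seed, for repeatability only
PATCH_SIZE = 16          # hypothetical: a flattened 4 x 4 patch
NUM_FEATURES = 64        # hypothetical dictionary size

# Stand-in basis features f_1 ... f_64 (random here, learned in practice).
features = {
    k: [rng.gauss(0.0, 1.0) for _ in range(PATCH_SIZE)]
    for k in range(1, NUM_FEATURES + 1)
}

# Activations from the slide: only three features are active.
activations = {36: 0.8, 42: 0.3, 63: 0.5}


def reconstruct(activations, features, size):
    """Return sum_k a_k * f_k over the active features only."""
    patch = [0.0] * size
    for k, a in activations.items():
        for i, value in enumerate(features[k]):
            patch[i] += a * value
    return patch


x = reconstruct(activations, features, PATCH_SIZE)

# The same code written densely: mostly zeros, hence a "sparse" code.
dense_code = [activations.get(k, 0.0) for k in range(1, NUM_FEATURES + 1)]
num_active = sum(1 for a in dense_code if a != 0.0)
print(num_active, "of", NUM_FEATURES, "features active")
```

The dense vector of activations, rather than the raw pixels, is what would be passed on as the feature representation of the patch.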
Supervised learning
[Figure: images labeled Cars and Motorcycles. Testing: What is this?]
Semi-supervised learning
[Figure: images labeled Car and Motorcycle. Testing: What is this?]
Self-taught learning
[Figure: images labeled Car and Motorcycle. Testing: What is this?]
Self-taught learning
Sparse coding, LCC, etc.
f1, f2, ..., fk
Motorcycle
a1, a2, ..., ak
[Figure: feature hierarchy, top to bottom: object parts (combination of edges); edges; pixels]
Cars
Elephants
Chairs
Input images
Samples from feedforward inference (control)
Samples from full posterior inference
http://www.youtube.com/watch?v=VdIURAu1-aU
http://www.youtube.com/watch?v=ZmNOAtZIgIk&feature=relmfu
References
General Info on Deep Learning
http://deeplearning.net/
Review
Tutorials & Workshops