001 Intro
http://bit.ly/DLSP20
Yann LeCun
NYU - Courant Institute & Center for Data Science
Facebook AI Research
http://yann.lecun.com
TAs: Alfredo Canziani, Mark Goldstein
Course information
Website:
http://bit.ly/DLSP20
TAs: Alfredo Canziani & Mark Goldstein
Lectures:
9 lectures by YLC
3 guest lectures
Practical session
Tuesday evenings with Alfredo
Evaluation
Mid-term exam
Final project (on self-supervised learning & autonomous driving)
Self-supervised learning
Contrastive methods and regularization methods for energy shaping
Accelerated inference: encoder, LISTA, VAE
Denoising AE, variational AE, contrastive divergence….
Generative Adversarial Networks
SSL and beyond
How does human and animal learning work?
How do we get to human-level AI?
Building models of the world for control
Supervised Learning
Training a machine by showing examples instead of programming it
When the output is wrong, tweak the parameters of the machine
Works well for:
Speech → words
Image → categories
Portrait → name
Photo → caption
Text → topic
[Figure: example images labeled CAR and PLANE]
https://youtu.be/X1G2g3SiCwU
[Diagram: traditional pattern recognition (Feature Extractor → Trainable Classifier) versus Deep Learning, where the whole stack is trainable: weight matrix, hidden layer, weight matrix, hidden layer, …]
Stochastic Gradient Descent (SGD)
[Diagram: input → function with adjustable parameters → objective function → error (example output: traffic light: -1)]
It's like walking in the mountains in a fog and following the direction of steepest descent to reach the village in the valley.
But each sample gives us a noisy estimate of the direction, so our path is a bit random.
W_i ← W_i − η ∂L(W, X)/∂W_i
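A minimal sketch of this update rule in PyTorch (my own illustration, not from the lecture; the tiny linear model and its made-up data are just there to show one-sample-at-a-time updates):

import torch

torch.manual_seed(0)
W = torch.zeros(2, requires_grad=True)   # the adjustable parameters
eta = 0.1                                # learning rate

for step in range(200):
    # one random training sample (X, y): the "noisy estimate of the direction"
    X = torch.randn(2)
    y = X @ torch.tensor([1.0, -2.0])    # hypothetical ground-truth relation

    loss = (X @ W - y) ** 2              # objective function L(W, X)
    loss.backward()                      # compute dL/dW

    with torch.no_grad():
        W -= eta * W.grad                # the update W_i <- W_i - eta * dL/dW_i
        W.grad.zero_()

print(W)                                 # approaches the true weights [1, -2]

Each step uses a single sample, so individual updates point in noisy directions, but on average they descend the loss.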
[Diagram: X (input) mapped to Y (desired output)]
Convolutional network: multiple stages of convolutions and pooling/subsampling
[Fukushima 1982], [LeCun 1989, 1998], [Riesenhuber 1999]
[Diagram: alternating stages of filter bank + non-linearity and pooling]
[Osadchy, Miller, LeCun JMLR 2007], [Kavukcuoglu et al. NIPS 2010], [Sermanet et al. CVPR 2013]
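A minimal sketch of this stage structure in PyTorch (my own illustration, not from the lecture; the channel counts and image size are arbitrary): learned filter banks, non-linearities, and pooling, topped by a trainable classifier.

import torch
import torch.nn as nn

convnet = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, padding=2),   # filter bank (learned convolutions)
    nn.ReLU(),                                    # non-linearity
    nn.MaxPool2d(2),                              # pooling / subsampling
    nn.Conv2d(16, 32, kernel_size=5, padding=2),  # second stage
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # trainable classifier on top
)

x = torch.randn(1, 3, 32, 32)                     # a dummy 32x32 RGB image
print(convnet(x).shape)                           # torch.Size([1, 10])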
Depth inflation
VGG [Simonyan 2013]
GoogLeNet [Szegedy 2014]
ResNet [He et al. 2015] (residual block sketched after this list)
DenseNet [Huang et al. 2017]
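The jump to ResNet-scale depth rests on residual (skip) connections: each block learns a correction that is added back to its input. A minimal sketch (my own, not the official ResNet code), assuming a standard PyTorch setup:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)        # skip connection: output = F(x) + x

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 16, 16)).shape)   # shape is preserved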
ResNet50 and ResNet100 are used routinely in production.
Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]
Mask R-CNN [He et al. arXiv:1703.06870]: the ConvNet produces an object mask for each region of interest
RetinaNet/FPN [Lin et al. arXiv:1708.02002]: one-pass object detection
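Both architectures ship pre-trained in torchvision; a minimal usage sketch (my own, with API details assuming a 2020-era torchvision where pretrained=True was the loading flag):

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

# One dummy 3-channel image with values in [0, 1]; replace with a real image tensor.
image = torch.rand(3, 480, 640)

with torch.no_grad():
    predictions = model([image])   # list with one dict per input image

# Each prediction dict holds 'boxes', 'labels', 'scores', and per-instance 'masks'.
print(predictions[0]["boxes"].shape, predictions[0]["masks"].shape)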
[Slide: ConvNets for driving: MobilEye (2015), NVIDIA]
Applications of ConvNets
Self-driving cars, visual perception
Medical signal and image analysis
Radiology, dermatology, EEG/seizure prediction….
Bioinformatics/genomics
Speech recognition
Language translation
Image restoration/manipulation/style transfer
Robotics, manipulation
Physics
High-energy physics, astrophysics
New applications appear every day
E.g. environmental protection,….
Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]
[Diagram: Trainable Feature Extractor → Trainable Classifier]
Basic principle: expanding the dimension of the representation so that things are more likely to become linearly separable (a small sketch follows the list below).
- space tiling
- random projections
- polynomial classifier (feature cross-products)
- radial basis functions
- kernel machines
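A minimal sketch of the random-projections case (my own illustration; the dataset, dimensions, and cosine non-linearity are arbitrary choices): a linear classifier fails on concentric circles in 2-D but succeeds after a random non-linear expansion.

import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression

X, y = make_circles(n_samples=1000, noise=0.05, factor=0.5, random_state=0)

# A linear classifier in the raw 2-D space does barely better than chance.
print("raw 2-D accuracy:", LogisticRegression().fit(X, y).score(X, y))

# Random projection to a high-dimensional space, followed by a simple non-linearity.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 500))             # random projection matrix (2 -> 500 dims)
b = rng.uniform(0, 2 * np.pi, 500)
features = np.cos(X @ W + b)              # random Fourier-style features

# The same linear classifier now separates the two circles almost perfectly.
clf = LogisticRegression(max_iter=1000).fit(features, y)
print("projected accuracy:", clf.score(features, y))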
Hierarchical representation
Hierarchy of representations with increasing level of abstraction
Each stage is a kind of trainable feature transform (a small sketch follows this list)
Image recognition
Pixel → edge → texton → motif → part → object
Text
Character → word → word group → clause → sentence → story
Speech
Sample → spectral band → sound → … → phone → phoneme → word
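To make "each stage is a trainable feature transform" concrete, here is a minimal sketch (my own illustration; the stage sizes and the edge/part/object labels are loose analogies, not anything from the lecture):

import torch
import torch.nn as nn

stages = nn.ModuleDict({
    "low":  nn.Sequential(nn.Linear(784, 256), nn.ReLU()),   # e.g. edge-like features
    "mid":  nn.Sequential(nn.Linear(256, 64), nn.ReLU()),    # e.g. motifs / parts
    "high": nn.Sequential(nn.Linear(64, 10)),                # e.g. object categories
})

x = torch.randn(1, 784)          # a flattened dummy image
for name, stage in stages.items():
    x = stage(x)                 # each level is a trainable transform of the one below
    print(name, x.shape)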
[Diagram: image → Ideal Feature Extractor → feature vector [−3, 0.2, −2, …] whose components correspond to pose, lighting, expression, …]
Data Manifold & Invariance
[Diagram: images as points in pixel space (axes Pixel 1, Pixel 2, …, Pixel n) lie on a low-dimensional manifold; the Ideal Feature Extractor maps them to coordinates on that manifold such as view and expression]
[Diagram: high-dimensional input → non-linear function → unstable/non-smooth features → pooling or aggregation → stable/invariant features]
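A minimal sketch of that last step (my own illustration, using a random signal): pooling over a window makes the aggregated features much less sensitive to a small shift of the input.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 1, 32)                    # a 1-D signal stored as an (N, C, H, W) tensor
x_shifted = torch.roll(x, shifts=1, dims=-1)    # the same signal shifted by one sample

# The raw features change a lot under the shift...
print("raw difference:   ", (x - x_shifted).abs().mean().item())

# ...but after max-pooling with a wide window, the aggregated features change much less.
p = F.max_pool2d(x, kernel_size=(1, 8), stride=(1, 8))
p_shifted = F.max_pool2d(x_shifted, kernel_size=(1, 8), stride=(1, 8))
print("pooled difference:", (p - p_shifted).abs().mean().item())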