Lecture 1
● Group project
○ Two to three person team
○ Poster presentation and write-up
A Crash Course on Deep Learning
Elements of Machine Learning
Model
Objective
Training
What’s Special About Deep Learning
Compositional
Model
Image Modeling
Convolutional Nets
Language/Speech
Recurrent Nets
Image Modeling and Convolutional Nets
Breakthrough of Image Classification
Evolution of ConvNets
• LeNet (LeCun, 1998)
– Basic structures: convolution, max-pooling, softmax
• AlexNet (Krizhevsky et al., 2012)
– ReLU, Dropout
• GoogLeNet (Szegedy et al., 2014)
– Multiple independent pathways (sparse weight matrix)
• Inception BN (Ioffe et al., 2015)
– Batch normalization
• Residual net (He et al., 2015)
– Residual pathway (identity skip connection)
Fully Connected Layer
Output
Input
Convolution = Spatial Locality + Sharing
Spatial Locality
Without Sharing
With Sharing
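The two ingredients above can be sketched directly: a single kernel of weights slides over the input (spatial locality), and the same kernel is reused at every position (sharing). A minimal single-channel NumPy sketch, not a framework implementation:

```python
import numpy as np

def conv2d(x, w):
    """Valid 2D convolution (cross-correlation) of a single-channel
    input x with one shared kernel w: each output looks only at a
    local window, and every window reuses the same weights."""
    H, W = x.shape
    kH, kW = w.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((3, 3)) / 9.0        # one shared 3x3 averaging kernel
y = conv2d(x, w)
print(y.shape)                    # (2, 2)
```

Without sharing, each of the 4 output positions would need its own 3x3 weights; with sharing there are only 9 parameters total.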
Convolution with Multiple Channels
Source: http://cs231n.github.io/convolutional-networks/
Pooling Layer
Can be replaced by strided convolution
Source: http://cs231n.github.io/convolutional-networks/
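A 2x2 max-pooling layer can be sketched as follows; the note that pooling can be replaced by a strided convolution means the same stride-2 downsampling can be folded into the convolution itself. A NumPy sketch, assuming the input sides are divisible by 2:

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2: keep the largest value in
    each non-overlapping 2x2 window."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [0., 5., 4., 1.]])
print(max_pool2x2(x))   # [[4. 8.] [9. 4.]]
```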
LeNet (LeCun 1998)
• Convolution
• Pooling
• Flatten
• Fully connected
• Softmax output
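The layer stack above shrinks the spatial size step by step until the feature map is flattened. A small helper makes the bookkeeping concrete; the 28x28 input and the specific kernel sizes here are illustrative assumptions, not the exact LeNet-5 configuration:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output spatial size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# Assumed LeNet-style trace on a 28x28 input:
s = 28
s = conv_out(s, kernel=5)             # 5x5 conv     -> 24
s = conv_out(s, kernel=2, stride=2)   # 2x2 pool     -> 12
s = conv_out(s, kernel=5)             # 5x5 conv     -> 8
s = conv_out(s, kernel=2, stride=2)   # 2x2 pool     -> 4
print(s)   # 4: flatten 4*4*channels, then fully connected + softmax
```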
AlexNet (Krizhevsky et al., 2012)
Challenges: From LeNet to AlexNet
● Overfitting prevention
○ Dropout regularization
• ReLU
• Why ReLU?
– Cheap to compute
– It is roughly linear, so gradients do not saturate for positive inputs
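Both points are visible in the definition itself: ReLU is just a max with zero, so it is cheap, and it is exactly linear on the positive side (NumPy sketch):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: zero for negatives, identity for positives."""
    return np.maximum(x, 0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))   # [0.  0.  0.  0.5 2. ]
```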
Dropout Regularization
● Randomly zero out neurons with probability 0.5
Dropout Mask
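The dropout mask can be sketched as a per-neuron Bernoulli sample. This sketch uses the common "inverted dropout" variant, in which survivors are rescaled by 1/(1-p) at training time so the expected activation matches test time (where dropout is simply turned off); the slides do not specify the variant, so the rescaling is an assumption:

```python
import numpy as np

def dropout(x, p=0.5, rng=None):
    """Inverted dropout: zero each activation with probability p,
    scale survivors by 1/(1-p) to keep the expectation unchanged."""
    rng = rng or np.random.default_rng(0)  # seeded here for reproducibility
    mask = rng.random(x.shape) >= p        # keep with probability 1 - p
    return x * mask / (1.0 - p)

y = dropout(np.ones(10), p=0.5)
print(y)   # about half zeros; survivors scaled to 2.0
```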
GoogLeNet: Multiple Pathways, Fewer Parameters
Vanishing and Exploding Value Problem
● Imagine each layer multiplies its input by the same weight matrix W
○ W > 1: exponential explosion
○ W < 1: exponential vanishing
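The exponential behavior is easy to demonstrate with the scalar version of this intuition: repeated multiplication by a fixed weight blows up when it exceeds 1 and collapses toward zero when it is below 1 (toy sketch):

```python
big, small = 1.0, 1.0
w_big, w_small = 1.5, 0.5
for _ in range(50):        # 50 "layers", each multiplying by the same weight
    big *= w_big
    small *= w_small
print(big)    # ~6.4e8: exponential explosion
print(small)  # ~8.9e-16: exponential vanishing
```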
• Subtract mean
• Divide by standard deviation
• Output is invariant to input scale!
– Scale input by a constant
– Output of BN remains the same
• Impact
– Easy to tune learning rate
– Less sensitive initialization
(Ioffe et al., 2015)
Scale Normalization (assumes zero mean): invariance to input magnitude!
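The invariance claim is direct to check: subtracting the mean and dividing by the standard deviation cancels any constant factor on the input (NumPy sketch over a single feature batch; the small epsilon inside the square root is the usual numerical-stability assumption):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize one feature over the batch: subtract the batch mean,
    divide by the batch standard deviation."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

x = np.array([1.0, 2.0, 3.0, 4.0])
y1 = batch_norm(x)
y2 = batch_norm(10.0 * x)   # scale the input by a constant
print(np.allclose(y1, y2, atol=1e-3))   # True: output invariant to scale
```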
Residual Net (He et al., 2015)
● http://dlsys.cs.washington.edu/materials
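The residual pathway adds the block's input back to its output, so each block only has to learn a correction on top of the identity; gradients can flow through the identity path even when the learned part is poorly conditioned. A toy sketch (the stand-in function f represents an arbitrary layer):

```python
import numpy as np

def residual_block(x, f):
    """Residual connection: output = x + f(x)."""
    return x + f(x)

x = np.array([1.0, 2.0, 3.0])
y = residual_block(x, lambda v: 0.1 * v)   # f is a stand-in for conv layers
print(y)   # [1.1 2.2 3.3]
```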
Lab 1 on Thursday
● Walk through how to implement a simple model for digit recognition
using MXNet Gluon
● Focus is on data I/O, model definition, and the typical training loop
● Familiarize yourself with typical framework APIs for vision tasks