Fundamental - Deep Learning
DEEP LEARNING
CS617: Basics of Deep Learning
Instructor: Aparajita Ojha
About the Course
• 4 credit Course
• Grading policy:
– Relative grading
– Quiz: 10%, Project: 20%, Midsem: 30%, Endsem: 40%
– Academic honesty: Grade F if any kind of unfair means is used in
any quiz, test, or project.
• Course details: available on my course page
web.iiitdmj.ac.in/~aojha
References: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, 2015; Fei-Fei Li, lecture slides, Introduction to CNN
Deep Learning Today
Source: developer.nvidia.com/deep-learning-courses
Introduction
• What is deep learning?
• “ A machine learning technique that allows computers to
improve with experience and data”.
• “Deep learning is a particular kind of machine learning
that achieves great power and flexibility by learning to
represent the world as a nested hierarchy of concepts,
with each concept defined in relation to simpler concepts,
and more abstract representations computed in terms of
less abstract ones”.
– From Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron
Courville, MIT Press, 2016.
Machine Learning
[Figure: the machine learning workflow: data → training algorithm → prediction.]
Figure source: Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016.
Machine Learning Algorithm
• A machine learning algorithm is an algorithm that is
able to learn and extract patterns from data.
• Learning –
– "A computer program is said to learn from
• experience E
• with respect to some class of tasks T
• and performance measure P
– if its performance at tasks in T, as measured by P,
improves with experience E." (Mitchell, 1997)
• Semi-supervised learning
• Unsupervised Learning
– observing several examples of a random vector x, and attempting to
implicitly or explicitly learn the probability distribution p(x), or some
interesting properties of that distribution (see the sketch after this list).
– there is no instructor; the algorithm must learn to make sense of
the data without this guide.
• Reinforcement learning
– Such algorithms interact with an environment.
– There is a feedback loop between the learning system and its
experiences.
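As a concrete illustration of the unsupervised setting, here is a minimal sketch of explicitly learning p(x) from unlabeled samples, assuming a Gaussian model for the data (the data, model choice, and variable names are illustrative, not from the slides):

    import numpy as np
    from scipy.stats import multivariate_normal

    # Unlabeled samples of a random vector x: no instructor, no labels.
    rng = np.random.default_rng(0)
    X = rng.normal(loc=[1.0, -2.0], scale=[0.5, 1.5], size=(1000, 2))

    # Explicitly learn p(x) under a Gaussian assumption by estimating
    # its mean and covariance from the data alone.
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    p = multivariate_normal(mean=mu, cov=cov)

    # Evaluate the learned density at a new point.
    print(p.pdf([1.0, -2.0]))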
• Many machine learning problems become exceedingly difficult when the number of dimensions in the data is high (the curse of dimensionality).
[Figure: the traditional machine learning pipeline: a hand-crafted feature extractor followed by a trainable output system.]
Figure from: Zeiler and Fergus, 2013, feature visualization of a convolutional NN trained on the ImageNet database.
Machine Learning vs Deep Learning
• w = w + η x y (η is the learning rate)
[Figure: a single neuron: inputs x1, x2, x3 arrive with weights w1, w2, w3; their weighted sum is passed through an activation function f to produce the output.]
Example: with inputs (2.7, 8.6, 0.002) and weights (−0.06, 2.5, 1.4), the weighted sum is
x = −0.06 × 2.7 + 2.5 × 8.6 + 1.4 × 0.002 = 21.3408, and σ(x) ≈ 1.0.
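A minimal sketch of this forward computation in Python, using the example values above (the sigmoid here is one common choice of activation):

    import numpy as np

    def sigmoid(z):
        # Logistic activation: squashes the weighted sum into (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([2.7, 8.6, 0.002])   # inputs
    w = np.array([-0.06, 2.5, 1.4])   # weights

    z = np.dot(w, x)                  # weighted sum: 21.3408
    print(z, sigmoid(z))              # sigmoid output is ~1.0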
Modelling ‘AND’
• Let us see how a neuron model can realize the 'AND' operation.
x1   x2   Output (Label)   f(w1·x1 + w2·x2 + b) = Label
0    0    0                f(w1·0 + w2·0 + b) = 0
0    1    0                f(w1·0 + w2·1 + b) = 0
1    0    0                f(w1·1 + w2·0 + b) = 0
1    1    1                f(w1·1 + w2·1 + b) = 1
Activation function: f(z) = 1 if z ≥ θ, and 0 otherwise.
• Can we suitably adjust the weight and bias parameters so that all four equations are satisfied?
• Yes; for example, with θ = 0.5, choose w1 = w2 = 0.3 and b = 0.
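A quick check of these parameters in Python (a sketch; the function and variable names are ours):

    def f(z, theta=0.5):
        # Threshold activation: fires only when the weighted sum reaches theta.
        return 1 if z >= theta else 0

    w1, w2, b = 0.3, 0.3, 0.0
    for x1, x2, label in [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]:
        assert f(w1 * x1 + w2 * x2 + b) == label
    print("AND reproduced for all four input pairs")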
General Perceptron Algorithm …
• Dealing with XOR (Minsky and Papert, 1969): XOR is not linearly separable, so a single perceptron cannot compute it (see the derivation below).
Figure from: Prof. Efstratios Gavves’s lecture slides on UVA Deep Learning Course
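Why no choice of weights works: using the threshold activation f from the 'AND' slide, XOR would require all of the following (a short derivation, not on the original slide):

$$f(b) = 0 \Rightarrow b < \theta, \qquad f(w_1 + b) = 1 \Rightarrow w_1 + b \ge \theta,$$
$$f(w_2 + b) = 1 \Rightarrow w_2 + b \ge \theta, \qquad f(w_1 + w_2 + b) = 0 \Rightarrow w_1 + w_2 + b < \theta.$$

Adding the two middle inequalities gives w1 + w2 + 2b ≥ 2θ, hence w1 + w2 + b ≥ 2θ − b > θ (since b < θ), contradicting the last requirement. So no single threshold unit can realize XOR.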
Limitations and Slow Down
• A single layer of perceptrons is not able to solve complex tasks.
– It yields only linear decision boundaries.
• The model is represented by a directed acyclic graph describing how functions f1, f2, f3, f4 are composed together.
• The composed function f(x) = f4(f3(f2(f1(x)))) approximates the target function f* (a sketch of such a composition follows the figure below).
[Figure: a feed-forward network: input layer x1, x2, …, xm, hidden layer(s), and an output layer.]
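A minimal sketch of such a layer-by-layer composition as a forward pass (layer sizes, weights, and the tanh nonlinearity are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    def make_layer(W, b):
        # Each f_i is an affine map followed by a nonlinearity.
        return lambda h: np.tanh(W @ h + b)

    # Four layers composed into f(x) = f4(f3(f2(f1(x)))).
    layers = [make_layer(rng.normal(size=(4, 4)), rng.normal(size=4))
              for _ in range(4)]

    x = rng.normal(size=4)        # an input sample
    h = x
    for f_i in layers:            # apply f1, then f2, f3, f4
        h = f_i(h)
    print(h)                      # the network's output f(x)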
Pattern Classification
• Function: x → y.
• The NN's output is used to recognize different input patterns (a small decoding sketch follows below).
• Different output patterns correspond to particular classes of input patterns.
• Networks with hidden layers can be used to solve more complex problems than just linear pattern classification.
[Figure: a network mapping an input pattern x = (x1, x2, …, xm) to an output pattern y = (y1, y2, …, yn).]
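For instance, the output pattern can be turned into a class decision by taking the largest output component (a sketch with made-up class names and values):

    import numpy as np

    classes = ["class A", "class B", "class C"]
    y = np.array([0.1, 0.7, 0.2])        # the NN's output pattern
    print(classes[int(np.argmax(y))])    # largest output wins: "class B"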
Generalization
• With proper training, a neural network may produce reasonable answers for input patterns not seen during training (generalization).
[Figure: the trained network producing an output pattern ŷ = (y1, y2, …, yn) for an input pattern x = (x1, x2, …, xm).]
An Example
Train the NN:
• Initialise with random weights.
• Present a training pattern and feed it to the NN.
• Get an output and compare it with the actual value (the class label).

Training Data:
Features         Class
1.4  2.7  1.9    0
3.8  3.4  3.2    0
6.4  2.8  1.7    1
4.1  0.1  0.2    0
etc.

[Figure: the first pattern (1.4, 2.7, 1.9) fed to the network produces output 0.8; the expected output is 0.]
ANN with 1 Hidden Layer…
Train the NN (continued):
• For the first pattern, the output 0.8 against the expected 0 gives Error = 0.8.
• Adjust and update the weights.
• Present the next training pattern: (6.4, 2.8, 1.7) produces output 0.9 against the expected 1, so Error = −0.1.
• Adjust and update the weights again.
• Repeat the process with further training patterns, e.g. (4.1, 0.1, 0.2), to reduce the error (a sketch of this loop follows below).
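A minimal sketch of this training loop for a 1-hidden-layer network on the toy data above, assuming sigmoid activations, a squared-error loss, and plain gradient updates (none of these choices is prescribed by the slides):

    import numpy as np

    rng = np.random.default_rng(0)

    # The toy training data from the slides: three features, one class label.
    X = np.array([[1.4, 2.7, 1.9], [3.8, 3.4, 3.2],
                  [6.4, 2.8, 1.7], [4.1, 0.1, 0.2]])
    t = np.array([0.0, 0.0, 1.0, 0.0])

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Initialise with random weights: 3 inputs -> 4 hidden units -> 1 output.
    W1, b1 = rng.normal(scale=0.5, size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(scale=0.5, size=4), 0.0
    lr = 0.1

    for step in range(5000):
        i = rng.integers(len(X))        # present a random training pattern
        h = sigmoid(W1 @ X[i] + b1)     # hidden activations
        y = sigmoid(W2 @ h + b2)        # network output
        err = y - t[i]                  # compare with the actual value
        # Adjust and update the weights (gradient of the squared error).
        dy = err * y * (1.0 - y)
        dh = dy * W2 * h * (1.0 - h)
        W2 -= lr * dy * h
        b2 -= lr * dy
        W1 -= lr * np.outer(dh, X[i])
        b1 -= lr * dh

    # After training, the outputs should approach the labels 0, 0, 1, 0.
    print([float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)) for x in X])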
Mathematical Model: ANN…
• The weight-adjustment process is repeated thousands and thousands
of times.
• Each time, a random training example is taken and the weights are
adjusted slightly to tune the system, reducing the approximation error.
• A single adjustment may not be an efficient one for many other cases.
• But eventually the process of weight adjustments leads to a model
good enough to produce an effective classifier.
• It works well in many real applications.
Further Advancements
• Backpropagation
• Sophisticated algorithms
– Long Short-Term Memory (LSTM) recurrent networks
(Hochreiter and Schmidhuber, 1997)
– Optical character recognition using Convolutional
Neural Networks (Yann LeCun et al., late 1990s)
From: Prof. David Wolfe Corne's lecture slides
Summary
• Deep learning technologies have brought significant performance
improvements on various complex tasks, such as machine translation,
object detection, and data hiding.
• Deep learning frameworks have evolved from ANN-based machine
learning methods.
• Deep learning differs from traditional ML methods in how networks
are trained: features are detected automatically, in a hierarchical
way, with no need to manually select them!
• Various deep learning architectures have been proposed over
the years.