DL Module 1 - CS-1 Fundamentals of Neural Network
Module 1
BITS Pilani
The course owner and author of this course deck,
Prof. Seetha Parameswaran, gratefully acknowledges
the authors who made their course materials freely
available online.
Course Logistics
What We Learn… (Module Structure)
Books
Textbook
● Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, Alex
J. Smola. https://d2l.ai/chapter_introduction/index.html
Reference book
● Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
https://www.deeplearningbook.org/
Evaluation Components and Schedule
Lab Sessions
● L1 Introduction to PyTorch
● L2 Deep Neural Network with Back-propagation and optimization
● L3 CNN
● L4 RNN
● L5 LSTM
● L6 Auto-encoders
Course Logistics
All submissions for graded components must be the result of your original
effort. Copying and pasting verbatim from any source, whether online or from
your peers, is strictly prohibited. The use of unauthorized sources or
materials, as well as collusion or unauthorized collaboration to gain an unfair
advantage, is also strictly prohibited. Please note that, in cases of
plagiarism, we will not distinguish between the person sharing their resources
and the one receiving them; the consequences will apply to both parties equally.
Definitions of Deep Learning
AI - ML - DL
Why Deep Learning?
Deep Learning Timeline
Applications of Deep Learning
Breakthroughs with Neural Networks
Applications of Deep Learning
Many more applications…
Core Components of a DL Problem
1. Data
● A collection of examples.
● Data has to be converted to a useful and suitable numerical
representation.
● Each example (or data point, data instance, sample) typically consists
of a set of attributes called features (or covariates), from which the
model must make its predictions.
● In supervised learning problems, the attribute to be predicted is
designated as the label (or target).
● Mathematically, a dataset is a set of m examples,
D = {(x^(1), y^(1)), …, (x^(m), y^(m))}, where x^(i) is the feature
vector of the i-th example and y^(i) is its label.
● We need the right data.
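As a concrete illustration, here is a minimal NumPy sketch (the numbers are made up, not from the deck) of this representation: a feature matrix X holding m examples of d features each, and a label vector y with one target per example.

```python
import numpy as np

# Hypothetical dataset: m = 4 examples, each with d = 3 features.
# Rows are examples, columns are features (covariates).
X = np.array([[5.1, 3.5, 1.4],
              [4.9, 3.0, 1.4],
              [6.2, 3.4, 5.4],
              [5.9, 3.0, 5.1]])

# One label (target) per example, as in supervised learning.
y = np.array([0, 0, 1, 1])

m, d = X.shape  # m examples, d-dimensional features
print(m, d)     # 4 3
```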
1. Data
● Dimensionality of data
○ When each example has the same number of numerical values, the data consist of
fixed-length vectors. E.g., images of a fixed size.
○ The constant length of the vectors is called the dimensionality of the data.
○ Text data, by contrast, has varying length.
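A small sketch of this contrast, assuming a 28×28 grayscale image and simple whitespace tokenization for text:

```python
import numpy as np

# A 28x28 grayscale image flattens to a fixed-length vector:
# every image example has exactly 784 numerical values.
image = np.random.rand(28, 28)
x_img = image.reshape(-1)
print(x_img.shape)  # (784,)

# Text, by contrast, yields varying-length data: these two
# sentences tokenize to different numbers of values.
sentences = ["deep learning", "neural networks are fun"]
tokens = [s.split() for s in sentences]
print([len(t) for t in tokens])  # [2, 4]
```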
2. Model
3. Objective Function
3. Loss Functions
Example of the Framework
● Create a Model
○ Define a flexible program whose behavior is determined by a number of
parameters.
○ To determine the best possible set of parameters, use the data. The parameters
should improve the performance of the program with respect to some measure of
performance on the task of interest.
○ After fixing the parameters, we call the program a model.
■ E.g., the model receives a snippet of audio as input and produces
a selection from {yes, no} as output.
○ The set of all distinct programs (input-output mappings) that we can produce just
by manipulating the parameters is called a family of models.
■ E.g., we expect that the same model family should be suitable for "Alexa"
recognition and "Hey Siri" recognition because they seem, intuitively, to be
similar tasks.
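A minimal Python sketch of these ideas (illustrative only; the make_model helper and the toy inputs are made up, not part of the deck). Each distinct parameter setting (w, b) picks out one program from the family:

```python
# A "flexible program" whose behavior is fixed by parameters w and b.
# The set of all such programs, over all (w, b), is the family of models.
# (A real wake-word model would take audio features as input.)
def make_model(w, b):
    def model(x):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return "yes" if score >= 0 else "no"
    return model

# Two members of the same family, differing only in their parameters.
model_a = make_model([1.0, -2.0], 0.5)
model_b = make_model([0.3, 0.7], -1.0)
print(model_a([1.0, 1.0]), model_b([1.0, 1.0]))  # no yes
```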
Reading from TB Dive into Deep Learning
● Chapter 1
● Chapter 2 for Python preliminaries, Linear Algebra,
Calculus, and Probability
Artificial Neural Network
What are Neural Networks?
Brain: Interconnected Neurons
Biological Neuron
Connectionist Machines
What are Artificial Neurons?
Properties of Artificial Neural Nets (ANNs)
When to consider Neural Networks?
Perceptron
Representing Logic Gates using a Perceptron
NOT Logic Gate
Question:
● How can a NOT gate be represented using a perceptron?
● What are the parameters of the NOT perceptron?
● Data (truth table): x1 = 0 → t = 1; x1 = 1 → t = 0.
Perceptron for NOT gate
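A minimal sketch of one possible answer. The weights below are one valid choice (any w1 < 0 with 0 < w0 < -w1 works), assuming a step activation that fires when z ≥ 0:

```python
# One possible NOT perceptron: weight w1 = -1, bias w0 = 0.5,
# with a step activation that outputs 1 when z >= 0.
def not_perceptron(x1, w1=-1.0, w0=0.5):
    z = w1 * x1 + w0
    return 1 if z >= 0 else 0

for x1 in (0, 1):
    print(x1, not_perceptron(x1))  # 0 -> 1, 1 -> 0
```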
AND Logic Gate
Question:
● How can an AND gate be represented using a perceptron?
● What are the parameters of the AND perceptron?
● Data (truth table): t = 1 only when x1 = 1 and x2 = 1; otherwise t = 0.
Perceptron for AND gate
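As with the NOT gate, a minimal sketch of one possible parameter choice, under the same step-activation assumption (output 1 when z ≥ 0):

```python
# One possible AND perceptron: w1 = w2 = 1, bias w0 = -1.5.
# Only x1 = x2 = 1 pushes z above zero.
def and_perceptron(x1, x2, w1=1.0, w2=1.0, w0=-1.5):
    z = w1 * x1 + w2 * x2 + w0
    return 1 if z >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, and_perceptron(x1, x2))  # 1 only for (1, 1)
```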
Exercise
Perceptron for OR gate
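A minimal sketch of one possible OR perceptron, under the same step-activation assumption. Any single active input already lifts z to 0.5 ≥ 0:

```python
# One possible OR perceptron: w1 = w2 = 1, bias w0 = -0.5.
def or_perceptron(x1, x2, w1=1.0, w2=1.0, w0=-0.5):
    z = w1 * x1 + w2 * x2 + w0
    return 1 if z >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, or_perceptron(x1, x2))  # 0 only for (0, 0)
```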
Perceptron Learning Algorithm
Convergence of Perceptron Learning Algorithm
It can be proved that the algorithm will converge if
● the training data is linearly separable, and
● the learning rate is sufficiently small.
○ The role of the learning rate is to moderate the degree to which weights are
changed at each step.
○ It is usually set to some small value (e.g., 0.1) and is sometimes made to decay as
the number of weight-tuning iterations increases.
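A minimal sketch of the perceptron learning algorithm, assuming the classic per-example update rule w ← w + η(t − h)x with a step activation; the train_perceptron helper and the AND example are illustrative, not from the deck:

```python
# Perceptron learning: repeat over the data, updating weights only
# on misclassified examples, until a full pass makes no mistakes.
def train_perceptron(data, eta=0.1, epochs=100):
    n = len(data[0][0])           # number of inputs
    w = [0.0] * n                 # weights
    w0 = 0.0                      # bias
    for _ in range(epochs):
        converged = True
        for x, t in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + w0
            h = 1 if z >= 0 else 0
            if h != t:            # misclassified: apply the update rule
                w = [wi + eta * (t - h) * xi for wi, xi in zip(w, x)]
                w0 += eta * (t - h)
                converged = False
        if converged:             # guaranteed to happen if the data is
            break                 # linearly separable and eta is small
    return w, w0

# Example: learn the AND gate from its truth table.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(and_data))
```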
Perceptron Learning Algorithm for NOT gate
Initial weights w1 = w0 = 0, learning rate η = 1.
x1 | t | w1 | w0 | z | h | isequal(t, h) | Δw | new w
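The rows of this trace can be regenerated with a short script. A sketch, assuming η = 1, a step activation that fires when z ≥ 0, and per-example updates:

```python
# Trace the perceptron learning algorithm on the NOT gate,
# using the update rule w <- w + eta * (t - h) * x
# (bias w0 updated with a constant input of 1).
w1, w0, eta = 0.0, 0.0, 1.0
data = [(0, 1), (1, 0)]          # NOT gate: x1 -> t
print("x1  t  w1  w0  z  h")
for epoch in range(5):
    for x1, t in data:
        z = w1 * x1 + w0
        h = 1 if z >= 0 else 0
        print(x1, t, w1, w0, z, h)
        w1 += eta * (t - h) * x1
        w0 += eta * (t - h)
# converges to w1 = -1.0, w0 = 0.0
```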
https://colab.research.google.com/drive/1DUVcOoUIWhl8GQKc6AWR1wi0LaMeNkgD?usp=sharing
Students, please note:
The Python notebook is shared with anyone who has the link, but access is
restricted to BITS email IDs. Please do not access it from a non-BITS email ID
or send requests for access.
Exercise
Representational Power of Perceptrons
Single Layer Perceptron
Example of Linearly Separable data
Non-Linearly Separable Data
Perceptron for XOR Gate
Solution for XOR
MLP for XOR
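A minimal sketch of one hand-built MLP for XOR, assuming step activations. XOR is not linearly separable, so no single perceptron computes it, but a two-layer network can: XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)). The weights below are one possible hand-picked solution:

```python
def step(z):
    return 1 if z >= 0 else 0

def xor_mlp(x1, x2):
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # hidden unit 1: OR
    h2 = step(-1.0 * x1 - 1.0 * x2 + 1.5)   # hidden unit 2: NAND
    return step(1.0 * h1 + 1.0 * h2 - 1.5)  # output unit: AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_mlp(x1, x2))  # 1 only for (0,1) and (1,0)
```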
Reference: Chapter 4 of Machine Learning by Tom M. Mitchell.
Thank You All!