
DEEP LEARNING

Pranita Mahajan

AGENDA
Syllabus
When DL over ML
Fundamentals of Neural Networks
Takeaways

SYLLABUS


TIMELINE
- Training, Optimization and Regularization of Deep Neural Networks
- Autoencoders: Unsupervised Learning
- Convolutional Neural Networks (CNN): Supervised Learning
- Recurrent Neural Networks (RNN)
- Recent Trends and Applications


WHEN DL OVER ML
When to use DL for analysis

AI >> ML >> DL

MACHINE LEARNING: Machine Learning is a set of algorithms that parse data, learn from it, and then apply what they have learned to make intelligent decisions.

ML LIMITATION: ML algorithms need a lot of domain expertise and human intervention, and are only capable of what they are designed for; nothing more, nothing less.

DEEP LEARNING: A subset of Machine Learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts.

DL PROBLEM SOLVING: Each concept is defined in relation to simpler concepts, and more abstract representations are computed in terms of less abstract ones.

DL LEARNING ARCHITECTURE

TEXT USE CASE
Deep learning techniques learn categories incrementally through their hidden-layer architecture, defining low-level categories like letters first, then slightly higher-level categories like words, and then higher-level categories like sentences.

IMAGE USE CASE
In the example of image recognition this means identifying light/dark areas before categorizing lines and then shapes, to allow face recognition.
Each neuron or node in the network represents one aspect of the whole, and together they provide a full representation of the image.
Each node or hidden layer is given a weight that represents the strength of its relationship with the output, and as the model develops the weights are adjusted.
OVERCOME
CURSE OF
DIMENSIONALITY

ARCHITECTURE OF A NEURAL NETWORK

HUMAN BRAIN vs NEURAL NETWORK

• Information passes through the synaptic connections and is processed at the soma.
• In the artificial neuron this is modelled as a multiplicative conversion: X -> WX.
• Most of the time this is followed by a non-linear function.
• A neural network (NN) is a collection of these neurons.
HOW A NN CAN BE USED TO IMPLEMENT FUNCTIONS

AND function
- Input features: X1 and X2
- Output: y (the AND function)
- 2-D feature space representation

Can it be considered a binary classification problem?

One of the many possible linear boundaries

How to construct a feature vector from this linear equation:
- Capture the coefficients of the linear line separating the two classes.
Implementing AND using a Neural Network

(Diagram: a non-linear function applied to the weighted inputs produces the AND operation.)
How to design a neuron to build the AND operation / function

Hence, with a single neuron with a threshold non-linear function we can implement AND logic.


Similarly, OR logic can be built.

As AND and OR are both linearly separable, we can solve each with a single neuron (see the sketch below); but for problems where X and y are non-linearly separable we need a neural network (MLP).
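A minimal sketch of this claim, assuming a simple step (threshold) non-linearity; the weight and bias values are illustrative choices that happen to realize AND and OR:

```python
import numpy as np

def step(z):
    return (z > 0).astype(int)          # threshold non-linearity

def neuron(X, w, b):
    return step(X @ np.array(w) + b)    # weighted sum, then threshold

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(neuron(X, [1, 1], b=-1.5))        # AND -> [0 0 0 1]
print(neuron(X, [1, 1], b=-0.5))        # OR  -> [0 1 1 1]
```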

XOR FUNCTION
IMPLEMENTATION
• It is a non-linear problem.
• Can we break this non-linear problem into a combination of linear problems?

XOR(X1, X2) = OR(X1, X2) AND NAND(X1, X2)

h1 = OR(X1, X2)
h2 = NAND(X1, X2)
XOR(X1, X2) = AND(h1, h2)

XOR FUNCTION IMPLEMENTATION


XOR FUNCTION
Weight vector of OR (gives h1): [-0.5 1 1]
Negation of AND is NAND (gives h2): [1.5 -1 -1]
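A small sketch wiring these weight vectors together (the first entry of each vector is treated as the bias, the threshold sits at 0, and the AND weight vector [-1.5 1 1] is an assumed choice consistent with the earlier AND slide):

```python
import numpy as np

def step(z):
    return (z > 0).astype(int)

def neuron(X, wv):
    b, w = wv[0], np.array(wv[1:])       # first entry = bias, rest = weights
    return step(X @ w + b)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
h1 = neuron(X, [-0.5, 1, 1])             # OR(X1, X2)
h2 = neuron(X, [1.5, -1, -1])            # NAND(X1, X2)
xor = neuron(np.stack([h1, h2], axis=1), [-1.5, 1, 1])  # AND(h1, h2)
print(xor)                               # [0 1 1 0]
```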

HOW MANY NEURONS ARE NEEDED
• We need 3 neurons.
• The 3 neurons are arranged in two layers.
• 1st (first layer): OR operation
• 2nd (first layer): NAND operation
• 3rd (second layer): AND operation


GENERAL NEURAL
NETWORK

In every layer the network computes a non-linear function.

The cascade of these layers gives the overall network function.

In the final (kth) layer the problem can be solved as a linearly separable one.


MULTI LAYER PERCEPTRON (MLP)

A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a set of outputs from a set of inputs.
• An MLP is characterized by several layers of nodes connected as a directed graph between the input and output layers.
• Feed-forward network: the job of each layer is to take input from the layer before it and pass its output on to the next layer, hence the name feed-forward NN (deep neural network).
• Every node in an intermediate layer is connected to the nodes of the next layer, and the purpose of training is to find the values of the weights between the layers.
BUILDING BLOCK OF DEEP NEURAL NETWORKS
• Perceptron

Frank Rosenblatt’s model

Multi layer Perceptron

Frank Rosenblatt’s model

BUILDING BLOCK OF DEEP NEURAL NETWORKS
• Sigmoid Neuron
Sigmoid neurons are similar to perceptrons, but they are slightly modified so that the output from the sigmoid neuron is much smoother than the step-function output of the perceptron.
The perceptron model takes several real-valued inputs and gives a single binary output.
In the perceptron model, every input xi has a weight wi associated with it.
The weights indicate the importance of the input in the decision-making process.
The model output is decided by a threshold W₀: if the weighted sum of the inputs is greater than W₀, the output will be 1, else the output will be 0. In other words, the model fires if the weighted sum is greater than the threshold.


• Let's see how this harsh thresholding affects a real-world problem.
• Red points indicate that a person would not buy a car and green points indicate that a person would buy one. Isn't it a bit odd that a person earning 50.1K will buy a car but someone earning 49.9K will not? A small change in the input to a perceptron can sometimes cause the output to completely flip, say from 0 to 1.
• This behaviour is a characteristic of the perceptron itself, which behaves like a step function. We can overcome this problem by introducing a new type of artificial neuron called the sigmoid neuron.
• Another limitation of the perceptron model is that its learning algorithm works only if the data is linearly separable.


In sigmoid neurons the output function is much smoother than the step function: a small change in the input causes only a small change in the output, as opposed to the stepped output. There are many functions with the characteristic "S"-shaped curve, known as sigmoid functions; the most commonly used is the logistic function.
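A tiny sketch of the difference, reusing the car-buying example from the previous slide; treating 50K as the threshold and using the raw difference as the weighted sum is an illustrative assumption:

```python
import numpy as np

def step(z):
    return 1 if z > 0 else 0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for income in (49.9, 50.1):               # salary in thousands
    z = income - 50.0                      # weighted sum relative to the threshold
    print(income, step(z), round(sigmoid(z), 3))
# the step output flips from 0 to 1, the sigmoid only moves from ~0.475 to ~0.525
```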


SUMMARY
If the Feed Forward algorithm only computed the weighted sums in each neuron,
propagated results to the output layer, and stopped there, it wouldn’t be able to learn the
weights that minimize the cost function. If the algorithm only computed one iteration,
there would be no actual learning.
This is where Backpropagation comes into play.


THREE CLASSES OF DEEP


LEARNING
1. Basics of Neural Networks
2. Convolutional Neural Networks
3. Recurrent Neural Networks


• Forward Propagation
• Cost Function
• Gradient Descent
• Learning Rate
• Backpropagation


CONVOLUTIONAL NEURAL NETWORK
Convolutional neural networks are mainly applied to image data. Suppose we have an input of size 28*28*3. If we used a normal fully connected neural network, every neuron in the first layer would already need 2352 (28*28*3) weights, and as the size of the image increases the number of parameters becomes very large. We "convolve" the images to reduce the number of parameters.
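A back-of-the-envelope sketch of why convolution helps; the 100-unit dense layer and the 100 filters of size 3x3 are assumed sizes for illustration (biases ignored):

```python
h, w, c = 28, 28, 3                 # input image: 28 * 28 * 3
n_units, k = 100, 3                 # assumed layer width and filter size

dense_params = h * w * c * n_units  # one weight per input value per unit
conv_params = k * k * c * n_units   # filter weights are shared across positions
print(dense_params, conv_params)    # 235200 vs 2700
```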


RECURRENT NEURAL NETWORK

• Recurrent neural networks are used especially for sequential data, where the previous output is used to help predict the next one. In this case the networks have loops within them. The loops within the hidden neurons give them the capability to store information about the previous words for some time, so as to be able to predict the output.
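A minimal sketch of that loop, assuming a plain (vanilla) RNN cell with a tanh non-linearity; the dimensions and random weights are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
Wx = rng.normal(size=(n_hid, n_in))     # input-to-hidden weights
Wh = rng.normal(size=(n_hid, n_hid))    # hidden-to-hidden weights (the loop)
b = np.zeros(n_hid)

h = np.zeros(n_hid)                     # hidden state storing past information
for x_t in rng.normal(size=(5, n_in)):  # a sequence of 5 time steps
    h = np.tanh(Wx @ x_t + Wh @ h + b)  # the previous state feeds into the next step
print(h.shape)                          # (8,)
```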


CHAPTER 2

TRAINING, OPTIMIZATION AND


REGULARIZATION OF DEEP
NEURAL NETWORK


TRAINING FEEDFORWARD DNN


• Forward Propagation
• Cost Function
• Gradient Descent
• Learning Rate
• Backpropagation


GRADIENT DESCENT
• Designing and training a neural network is not much different from training any other machine learning model with gradient descent.
• The largest difference between the linear models we have seen so far and neural networks is that the nonlinearity of a neural network causes most interesting loss functions to become nonconvex.
• This means that neural networks are usually trained with iterative, gradient-based optimizers that merely drive the cost function to a very low value.
• For feedforward neural networks, it is important to initialize all weights to small random values.
• Training a neural network is not much different from training any other model; computing the gradient is slightly more complicated for a neural network but can still be done efficiently and exactly.
• Today we will learn to compute the gradient using the back-propagation algorithm and modern generalizations of it.
• Back-propagation is an algorithm that applies the chain rule with a specific, highly efficient order of operations.

• Optimizer algorithms are optimization methods that help improve a deep learning model's performance. These optimization algorithms, or optimizers, strongly affect the accuracy and training speed of the deep learning model.
• An optimizer is a function or algorithm that adjusts the attributes of the neural network, such as weights and learning rates. It thus helps reduce the overall loss and improve accuracy. Choosing the right weights for the model is a daunting task, as a deep learning model generally consists of millions of parameters.
• Gradient descent can be considered the most popular method among this class of optimizers, as sketched below.
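A minimal sketch of gradient descent on a toy quadratic cost (the cost function, learning rate and starting point are illustrative assumptions):

```python
def loss(w):
    return (w - 3.0) ** 2        # toy cost with its minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)       # derivative dL/dw

w, lr = 0.0, 0.1                 # initial weight and learning rate
for _ in range(50):
    w = w - lr * grad(w)         # step against the gradient
print(round(w, 3), round(loss(w), 6))   # w approaches 3, the cost approaches 0
```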


LEARNING FACTORS
• The factors are as follows:
1. Initial weights
2. Steepness of activation function
3. Learning constant
4. Momentum
5. Network architecture
6. Necessary number of hidden neurons

1. Initial weights:
The weights of the network to be trained are typically initialized to small random values. The initialization strongly affects the ultimate solution.
2. Steepness of activation function:
The neuron's continuous activation function is characterized by its steepness factor. The derivative of the activation function also serves as a multiplying factor in building the components of the error signal vectors.
3. Learning constant:
The effectiveness and convergence of the error back-propagation learning algorithm depend significantly on the value of the learning constant.
4. Momentum:
The purpose of the momentum method is to accelerate the convergence of the error back-propagation learning algorithm. The method involves supplementing the current weight adjustment with a fraction of the most recent weight adjustment (see the sketch after this list).
5. Network architecture:
One of the most important attributes of a layered neural network design is choosing the architecture. The number of input nodes is simply determined by the dimension or size of the input vector to be classified; the input vector size usually corresponds to the total number of distinct features of the input patterns.
6. Necessary number of hidden neurons:
The problem of choosing the size of the hidden layer is under intensive study, with no conclusive answers available. A formula can be used to estimate how many hidden-layer neurons are needed to achieve classification into M classes in an x-dimensional pattern space.
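A minimal sketch of the momentum update described in item 4, on the same kind of toy quadratic cost; the learning constant and momentum factor are illustrative:

```python
def grad(w):
    return 2.0 * (w - 3.0)               # derivative of the toy cost (w - 3)^2

w, lr, alpha = 0.0, 0.1, 0.9             # weight, learning constant, momentum factor
delta_prev = 0.0                         # most recent weight adjustment
for _ in range(200):
    delta = -lr * grad(w) + alpha * delta_prev  # current step plus a fraction of the last one
    w += delta
    delta_prev = delta
print(round(w, 3))                       # approaches 3
```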


ACTIVATION
FUNCTIONS
Linear
Logistic / Sigmoid
Tanh
ReLU
Leaky ReLU
Softmax


TYPES OF ACTIVATION
FUNCTION
• Activation functions are generally of two types:
1. Linear or Identity Activation Function
2. Non-Linear Activation Function

Generally, neural networks use non-linear activation functions, which help the network learn complex data, compute and learn almost any function representing a question, and provide accurate predictions. They allow back-propagation because they have a derivative function that is related to the inputs. (The common choices listed on the previous slide are sketched below.)
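A minimal sketch of the activation functions listed above (the leaky-ReLU slope of 0.01 is a common but assumed choice):

```python
import numpy as np

def linear(z):
    return z

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)

def softmax(z):
    e = np.exp(z - np.max(z))            # shift for numerical stability
    return e / e.sum()

z = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(z), softmax(z))
```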

SIGMOID / LOGISTIC

• The sigmoid activation function is very simple: it takes a real value as input and gives a probability that is always between 0 and 1. It looks like an "S" shape.
• Its main advantage is that it is simple and good for classifiers. A big disadvantage is that it gives rise to the "vanishing gradients" problem, and its output isn't zero-centered, which makes the gradient updates go too far in different directions. Since 0 < output < 1, optimization becomes harder, and it takes very high computational time in the hidden layers of a neural network.

Tanh or Hyperbolic tangent

• Tanh helps solve the non-zero-centered problem of the sigmoid function. Tanh squashes a real-valued number to the range [-1, 1]. It is non-linear.
• Its derivative behaves much like the sigmoid's derivative.
• It cannot remove the vanishing gradient problem completely.


COMPARING TANH WITH SIGMOID


RELU
This is the most popular activation function, used in the hidden layers of a NN. The formula is deceptively simple: max(0, z). Despite its name and appearance, it is non-linear and provides the same benefits as sigmoid but with better performance.

Its main advantage is that it avoids and rectifies the vanishing gradient problem and is less computationally expensive than tanh and sigmoid.

But it also has a drawback: sometimes some gradients can be fragile during training and can die, which leads to dead neurons.


COMPARING RELU WITH SIGMOID


LEAKY RELU
• It prevents the dying-ReLU problem. This variation of ReLU has a small positive slope in the negative area, so it enables back-propagation even for negative input values.


RELU vs LEAKY RELU
Leaky ReLU does not provide consistent predictions for negative input values. During forward propagation, if the learning rate is set very high the update can overshoot and kill the neuron.


SOFTMAX
Generally, we use this function at the last layer of a neural network. It calculates the probability distribution of an event over 'n' different events. The main advantage of the function is that it can handle multiple classes.


SIGMOID WITH SOFTMAX

Let's take an example:
Sigmoid input values: -0.5, 1.2, -0.1, 2.4
Sigmoid output values: 0.37, 0.77, 0.48, 0.91
Softmax input values: -0.5, 1.2, -0.1, 2.4
Softmax output values: 0.04, 0.21, 0.05, 0.70

The probabilities produced by a sigmoid are independent. Furthermore, they are not constrained to sum to one: 0.37 + 0.77 + 0.48 + 0.91 = 2.53. The reason is that the sigmoid looks at each raw output value separately.
The softmax outputs, in contrast, are interrelated. The softmax probabilities always sum to one by design: 0.04 + 0.21 + 0.05 + 0.70 = 1.00. In this case, if we want to increase the likelihood of one class, the others have to decrease by the same total amount.
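A quick sketch reproducing these numbers (small differences from the slide values are due to rounding):

```python
import numpy as np

z = np.array([-0.5, 1.2, -0.1, 2.4])
sigmoid = 1.0 / (1.0 + np.exp(-z))
softmax = np.exp(z) / np.exp(z).sum()
print(sigmoid.round(2), sigmoid.sum().round(2))  # ~[0.38 0.77 0.48 0.92], sums to ~2.54
print(softmax.round(2), softmax.sum().round(2))  # ~[0.04 0.21 0.06 0.70], sums to 1.0
```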


LOSS AND LOSS FUNCTIONS FOR


TRAINING DEEP LEARNING NEURAL
NETWORKS


WHAT WE WILL STUDY


• Regression Models
• Squared Error loss
• Classification Model
• Cross Entropy
• Choosing output function and loss function


SQUARED ERROR LOSS


• In mathematical optimization and decision theory, a loss or cost function (sometimes also called an error
function) is a function that maps an event or values of one or more variables onto a real number intuitively
representing some “cost” associated with the event.
• In simple terms, the Loss function is a method of evaluating how well your algorithm is modeling your dataset. It
is a mathematical function of the parameters of the machine learning algorithm.
• You can’t improve what you can’t measure. That’s why the loss function comes into the picture to evaluate how
well your algorithm is modeling your dataset.


Advantages
1. Easy to interpret.
2. Always differentiable because of the square.
3. Only one local minimum.

Disadvantages
1. The error is in squared units, so it is not easily interpreted.
2. Not robust to outliers.

Note: in regression, use a linear activation function at the last neuron.
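A minimal sketch of the squared-error (MSE) loss on made-up regression targets and predictions:

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)    # average squared error per observation

y_true = np.array([3.0, -0.5, 2.0, 7.0])      # illustrative targets
y_pred = np.array([2.5, 0.0, 2.0, 8.0])       # illustrative predictions
print(mse(y_true, y_pred))                    # 0.375
```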


CROSS ENTROPY LOSS

• Binary cross entropy
• It is used in binary classification problems with two classes, for example whether a person has covid or not, or whether my article gets popular or not.
• Binary cross entropy compares each predicted probability to the actual class output, which can be either 0 or 1. It then calculates a score that penalizes the probabilities based on their distance from the expected value, that is, how close or far they are from the actual value (see the sketch below).

• Categorical cross entropy
• Categorical cross entropy is used for multiclass classification and softmax regression.
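A minimal sketch of binary cross entropy on made-up predicted probabilities (the small epsilon guards against log(0)):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])               # actual classes
y_pred = np.array([0.9, 0.1, 0.8, 0.4])       # predicted probabilities
print(round(binary_cross_entropy(y_true, y_pred), 3))   # ~0.34; confident wrong guesses cost more
```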


CHOOSING OUTPUT FUNCTION AND LOSS FUNCTION


• The loss function calculates the error per observation, whilst the cost function calculates the error
over the whole dataset.


SUMMARY
Types of Neural Networks – DNN, CNN, RNN
Gradient descent
Learning factors – W, alpha
Activation functions – Linear, Sigmoid, Tanh, ReLU
Loss
Cost

THANK YOU
Pranita Mahajan
[email protected]
