Deep Learning Notes
AM11 Om Nagvekar


Artificial Neural Network (ANN)

Activation Function:
• Introduces non-linearity into the output of a neuron.
• If all hidden layers in a neural network are linear, the network collapses to a single linear layer unless non-linearity is introduced.
• The primary role of the activation function is to transform the summed weighted input of a node into an output value to be fed to the next hidden layer or used as the output.
• It is generally also applied at the output layer.
• Activation functions: linear, ReLU, sigmoid, tanh, softmax, etc.


Linear activation function:
• For a linear activation function the derivative is constant, so the model does not learn anything useful from it.
• f(x) = x
  f'(x) = 1

Sigmoid activation function:
• Output values lie between 0 and 1.
• f(x) = 1 / (1 + e^(-x))
• In deep learning, the gradient of the sigmoid activation function is used to update the weights and biases of a neural network.


Softmax activation function:
• It is often used as the last activation function of a neural network, to normalize the output of the network into a probability distribution over the predicted output classes.
• Softmax is an activation function that scales numbers/logits into probabilities.
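A minimal NumPy sketch of the activation functions named above (sigmoid, tanh, ReLU, softmax); the function names here are illustrative, not from any particular library.

    import numpy as np

    def sigmoid(x):
        # Squashes values into (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        # Zero for negative inputs, identity for positive inputs
        return np.maximum(0.0, x)

    def softmax(logits):
        # Subtract the max for numerical stability, then normalize to probabilities
        exps = np.exp(logits - np.max(logits))
        return exps / np.sum(exps)

    x = np.array([-2.0, 0.0, 3.0])
    print(sigmoid(x))    # values between 0 and 1
    print(np.tanh(x))    # values between -1 and 1
    print(relu(x))       # [0. 0. 3.]
    print(softmax(x))    # sums to 1.0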


Loss Function :
• A loss function calculates the error between the actual value and the predicted value, so the model can adjust its weights in the next iteration.
• Gradually, with the help of an optimization algorithm, the model learns to reduce the loss.
• It is used in CNNs, ANNs, RNNs, DNNs, etc.

Cost Function :
• When the loss is computed over multiple training examples, it is called the cost function.
• The cost function is the overall loss over the model's training data.
Regression Loss :
• It is used in regression problems such as linear regression.
• E.g. MSE (Mean Squared Error, also known as L2 loss), Mean Absolute Error (MAE), Mean Bias Error, Epsilon Error.
• MAE is often preferred over MSE because it reduces the impact of outliers on the model, as illustrated in the sketch below.
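A small NumPy sketch comparing MSE and MAE on the same predictions; the sample numbers are made up purely to show the effect of one outlier.

    import numpy as np

    y_true = np.array([3.0, 5.0, 7.0, 100.0])   # last value is an outlier
    y_pred = np.array([2.5, 5.5, 6.5, 10.0])

    mse = np.mean((y_true - y_pred) ** 2)   # L2 loss: squares the outlier's error
    mae = np.mean(np.abs(y_true - y_pred))  # L1 loss: outlier contributes linearly

    print(f"MSE = {mse:.2f}")   # dominated by the outlier
    print(f"MAE = {mae:.2f}")   # much less affected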
Classification Loss :
• Binary Cross-Entropy, Balanced Cross-Entropy, Hinge Loss, Softmax Loss, Active Contour Loss, etc.
• Sigmoid Cross-Entropy (log-likelihood loss) is a sigmoid activation followed by a cross-entropy loss.
• Weighted Cross-Entropy is used when the classes are imbalanced.
• Balanced Cross-Entropy is essentially the same as weighted cross-entropy.
• Categorical Cross-Entropy is also called softmax loss.
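A minimal sketch of binary cross-entropy plus a weighted variant for imbalanced classes, written directly in NumPy (the helper names and pos_weight value are illustrative, not from any framework).

    import numpy as np

    def binary_cross_entropy(y_true, y_prob, eps=1e-12):
        # y_prob are sigmoid outputs in (0, 1); clip to avoid log(0)
        y_prob = np.clip(y_prob, eps, 1.0 - eps)
        return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

    def weighted_cross_entropy(y_true, y_prob, pos_weight, eps=1e-12):
        # pos_weight > 1 boosts the rare positive class (used for imbalanced data)
        y_prob = np.clip(y_prob, eps, 1.0 - eps)
        return -np.mean(pos_weight * y_true * np.log(y_prob)
                        + (1 - y_true) * np.log(1 - y_prob))

    y_true = np.array([1, 0, 0, 0, 1])
    y_prob = np.array([0.8, 0.1, 0.3, 0.2, 0.6])
    print(binary_cross_entropy(y_true, y_prob))
    print(weighted_cross_entropy(y_true, y_prob, pos_weight=3.0))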


How do the matrix dimensions change at each layer?

The weight matrix of layer l has dimensions n^[l] * n^[l-1]
where n^[l] is the number of neurons in layer l
and n^[l-1] is the number of neurons in the previous layer (l-1).
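A quick NumPy check of these shapes for a hypothetical network with 4 inputs, 5 hidden neurons, and 3 outputs (the layer sizes are arbitrary examples).

    import numpy as np

    n = [4, 5, 3]          # n[0] inputs, n[1] hidden neurons, n[2] outputs

    # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1)
    W1 = np.random.randn(n[1], n[0])
    b1 = np.zeros((n[1], 1))
    W2 = np.random.randn(n[2], n[1])
    b2 = np.zeros((n[2], 1))

    x = np.random.randn(n[0], 1)   # one input column vector
    a1 = np.tanh(W1 @ x + b1)      # shape (5, 1)
    a2 = W2 @ a1 + b2              # shape (3, 1)
    print(W1.shape, a1.shape, W2.shape, a2.shape)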


Gradient Descent:

• How to adjust the weights when the loss is high:
  W = W - alpha * dJ/dW
  where W is the weight,
  alpha is the learning rate,
  dJ/dW is the slope (gradient) of the loss at that point.
• alpha is commonly between 0.1 and 0.001.
• alpha is also called a hyperparameter.
• This update rule is what the stochastic gradient descent (SGD) optimizer uses.
• The main types are SGD, batch gradient descent, and mini-batch gradient descent.
• Batch size is normally chosen as a power of 2 (e.g. 2^8 = 256).
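A bare-bones sketch of the update rule W = W - alpha * dJ/dW, fitting a single weight by gradient descent on a toy squared-error objective (all the numbers here are made up).

    # Toy objective: J(w) = (w * x - y)^2 for one sample
    x, y = 2.0, 8.0           # so the ideal weight is 4.0
    w = 0.0
    alpha = 0.01              # learning rate (hyperparameter)

    for step in range(200):
        pred = w * x
        grad = 2 * (pred - y) * x   # dJ/dw
        w = w - alpha * grad        # gradient descent update
    print(w)   # approaches 4.0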
Chain Rule:
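Backpropagation applies the chain rule layer by layer. A tiny sketch for one neuron with a sigmoid activation and squared-error loss; the variable names are illustrative.

    import numpy as np

    x, y = 1.5, 1.0          # input and target
    w, b = 0.4, 0.1          # parameters

    # Forward pass
    z = w * x + b            # pre-activation
    a = 1 / (1 + np.exp(-z)) # sigmoid activation
    L = (a - y) ** 2         # squared-error loss

    # Backward pass via the chain rule: dL/dw = dL/da * da/dz * dz/dw
    dL_da = 2 * (a - y)
    da_dz = a * (1 - a)      # derivative of sigmoid
    dz_dw = x
    dL_dw = dL_da * da_dz * dz_dw
    print(dL_dw)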


Vanishing Gradient problem :


• The vanishing gradient problem is a phenomenon that occurs during the
training of deep neural networks. It happens when the gradients used to
update the network become too small or "vanish" as they are
backpropagated from the output layers to the earlier layers.
• The exploding gradient problem occurs when gradients become very large
during backpropagation. This can lead to a rapid increase in values as they
are propagated backward through the layers.
• Solutions:
  o Reduce the model complexity
  o Careful weight initialization
  o Residual paths (skip connections)
  o Use the ReLU or Leaky ReLU activation function
  o Batch normalization
  o Gradient clipping (see the sketch below)
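A minimal sketch of one of the listed solutions, gradient clipping by global norm, in plain NumPy (the max_norm threshold and helper name are arbitrary choices for illustration).

    import numpy as np

    def clip_by_global_norm(grads, max_norm=1.0):
        # Rescale all gradients together if their combined L2 norm explodes
        total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        if total_norm > max_norm:
            scale = max_norm / (total_norm + 1e-12)
            grads = [g * scale for g in grads]
        return grads

    grads = [np.array([3.0, 4.0]), np.array([12.0])]   # global norm = 13
    clipped = clip_by_global_norm(grads, max_norm=1.0)
    print(clipped)   # same directions, combined norm scaled down to ~1.0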
Hyperparameters :
o Learning rate, number of epochs, optimizer (Adam, SGD, RMSProp), activation function, loss function, number of neurons, number of layers, batch size, train/validation data split, dropout.
Variance And Bias:
Regularization:
Regularization adds an extra penalty component to the loss function and minimizes the combined objective.
Regularization refers to techniques used to calibrate machine learning models in order to minimize the adjusted loss function and prevent overfitting or underfitting.
1. L2 Regularization:
o L2 regularization is a regularization technique used in deep learning to prevent neural networks from overfitting on training data. It is also known as weight decay or Ridge Regression.
o L2 regularization prevents weights from becoming too large.
o L2 regularization ensures that the important components in the weight
vector are larger than the other components.
o L2 regularization adds the square of the weights to the loss function. This
tends to evenly distribute the importance across all features, reducing the
magnitude of weights and preventing them from growing too large.


Cost function = Loss + λ * ∑ ||w||^2

In the cost function, the penalty term is controlled by lambda (λ). By changing the value of λ we control the strength of the penalty: the higher the penalty, the smaller the magnitude of the coefficients. It shrinks the parameters, so it helps prevent multicollinearity and reduces model complexity by coefficient shrinkage.

For example, let Loss = 0 (considering the two points on the line), λ = 1, w = 1.4.
Then, Cost function = 0 + 1 x (1.4)^2 = 1.96
For Ridge Regression, let us assume Loss = 0.13, λ = 1, w = 0.7.
Then, Cost function = 0.13 + 1 x (0.7)^2 = 0.62
Note : For L1 regularization, Cost function = Loss + λ * ∑ ||w||
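The two worked numbers above can be reproduced with a short sketch; the helper name l2_cost is just illustrative.

    import numpy as np

    def l2_cost(loss, weights, lam):
        # Cost = Loss + lambda * sum(w^2)
        return loss + lam * np.sum(np.square(weights))

    print(l2_cost(0.0,  np.array([1.4]), lam=1.0))   # 0 + 1 * 1.4^2 = 1.96
    print(l2_cost(0.13, np.array([0.7]), lam=1.0))   # 0.13 + 1 * 0.7^2 = 0.62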

Exponentially weighted Moving Average:
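A minimal sketch of the exponentially weighted moving average v_t = beta * v_{t-1} + (1 - beta) * x_t; this running average is the building block used by momentum, RMSProp, and Adam below. The sample sequence and beta value are arbitrary.

    import numpy as np

    def ewma(values, beta=0.9):
        # v_t = beta * v_{t-1} + (1 - beta) * x_t
        v, out = 0.0, []
        for x in values:
            v = beta * v + (1 - beta) * x
            out.append(v)
        return np.array(out)

    noisy = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
    print(ewma(noisy, beta=0.9))   # smoothed version of the sequence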


RMSProp :
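A minimal sketch of one RMSProp step, assuming the standard form that keeps a moving average of the squared gradients; the hyperparameter values are common defaults, not taken from the notes.

    import numpy as np

    def rmsprop_step(w, grad, s, alpha=0.001, beta=0.9, eps=1e-8):
        # Moving average of the squared gradient, then scale the step by it
        s = beta * s + (1 - beta) * grad ** 2
        w = w - alpha * grad / (np.sqrt(s) + eps)
        return w, s

    w, s = np.array([1.0, -2.0]), np.zeros(2)
    grad = np.array([0.5, -0.1])
    for _ in range(5):
        w, s = rmsprop_step(w, grad, s)
    print(w)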


Adam Optimizer:

where
alpha is the learning rate,
beta1 controls the momentum term (exponentially weighted average of the gradients dW),
beta2 controls the RMSProp term (exponentially weighted average of the squared gradients dW^2).
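A minimal sketch of one Adam step, combining the momentum average (beta1, over dW) and the RMSProp average (beta2, over dW^2) with bias correction; the default values shown are the commonly used ones, assumed here for illustration.

    import numpy as np

    def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * grad          # momentum: average of dW
        v = beta2 * v + (1 - beta2) * grad ** 2     # RMSProp: average of dW^2
        m_hat = m / (1 - beta1 ** t)                # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
        return w, m, v

    w = np.array([0.5, -1.0])
    m, v = np.zeros(2), np.zeros(2)
    grad = np.array([0.2, -0.3])
    for t in range(1, 6):
        w, m, v = adam_step(w, grad, m, v, t)
    print(w)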


AdaGrad :

There is a chance that after some epochs learning effectively stops, because AdaGrad has no decay factor (beta): the accumulated sum of squared gradients keeps growing, so the effective learning rate shrinks toward zero.
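A minimal sketch of AdaGrad showing why learning can stall: the squared-gradient accumulator only grows, so the effective step size keeps shrinking (the numbers are arbitrary).

    import numpy as np

    def adagrad_step(w, grad, acc, alpha=0.1, eps=1e-8):
        acc = acc + grad ** 2                       # accumulates forever (no decay factor)
        w = w - alpha * grad / (np.sqrt(acc) + eps)
        return w, acc

    w, acc = np.array([1.0]), np.zeros(1)
    grad = np.array([0.5])
    for step in range(1, 6):
        w, acc = adagrad_step(w, grad, acc)
        effective_lr = 0.1 / (np.sqrt(acc) + 1e-8)
        print(step, w, effective_lr)   # the effective learning rate keeps shrinking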

Internal Covariate shift :


• The term internal covariate shift comes from the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
• We define Internal Covariate Shift as the change in the distribution of
network activations due to the change in network parameters during
training.
• In neural networks, the output of the first layer feeds into the second
layer, the output of the second layer feeds into the third, and so on. When
the parameters of a layer change, so does the distribution of inputs to
subsequent layers.
• These shifts in input distributions can be problematic for neural networks,
especially deep neural networks that could have a large number of layers.
• Batch normalization is a method intended to mitigate internal covariate
shift for neural networks.
• In batch normalization, subtract the batch mean and divide by the square root of the batch variance (i.e. the standard deviation).
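A minimal sketch of the batch-normalization transform described above, with the learnable scale gamma and shift beta included; eps is the usual small constant for numerical stability, and the sample sizes are arbitrary.

    import numpy as np

    def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
        # x: (batch_size, features); normalize each feature over the batch
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
        return gamma * x_hat + beta               # learnable scale and shift

    x = np.random.randn(32, 4) * 10 + 5           # badly scaled activations
    out = batch_norm(x)
    print(out.mean(axis=0))   # ~0 per feature
    print(out.std(axis=0))    # ~1 per feature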


Convolutional Neural Network (CNN):


• One advantage of a CNN over an ANN is weight sharing, so it requires far fewer weights; in a fully connected ANN, connecting one layer of 100 neurons to the next layer of 100 neurons already requires 100*100 weights.
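A rough parameter count illustrating the weight-sharing point: a fully connected layer from 100 units to 100 units versus a small convolutional layer whose kernel is reused at every spatial position (the channel and kernel sizes here are arbitrary examples).

    # Fully connected: every input unit connects to every output unit
    dense_params = 100 * 100 + 100            # weights + biases = 10100

    # Convolution: one 3x3 kernel per (input-channel, output-channel) pair,
    # reused at every spatial position of the feature map
    in_channels, out_channels, k = 3, 16, 3
    conv_params = out_channels * (in_channels * k * k) + out_channels   # = 448

    print(dense_params, conv_params)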


LeNet-5:

AlexNet :


GoogLeNet:


ResNet:


GAN (Generative Adversarial Network) :

