UNIT 2 Notes

Deep learning

Uploaded by

Anami


History of Deep Learning - A Probabilistic Theory of Deep Learning - Backpropagation and
regularization, batch normalization - VC Dimension and Neural Nets - Deep Vs Shallow
Networks - Convolutional Networks - Generative Adversarial Networks (GAN), Semi-supervised
Learning

History of Deep Learning [DL]:

• The chain rule that underlies the back-propagation algorithm was invented in the
seventeenth century (Leibniz, 1676; L’Hôpital, 1696).
• Beginning in the 1940s, function approximation techniques were used to
motivate machine learning models such as the perceptron.
• The earliest models were based on linear models. Critics including Marvin
Minsky pointed out several of the flaws of the linear model family, such as its
inability to learn the XOR function, which led to a backlash against the entire
neural network approach.
• Efficient applications of the chain rule based on dynamic programming began to
appear in the 1960s and 1970s.
• Werbos (1981) proposed applying chain rule techniques for training artificial
neural networks. The idea was finally developed in practice after being
independently rediscovered in different ways (LeCun, 1985; Parker, 1985;
Rumelhart et al., 1986a).
• Following the success of back-propagation, neural network research gained
popularity and reached a peak in the early 1990s. Afterwards, other machine
learning techniques became more popular until the modern deep learning
renaissance that began in 2006.
• The core ideas behind modern feedforward networks have not changed
substantially since the 1980s. The same back-propagation algorithm and the
same approaches to gradient descent are still in use.

A Probabilistic Theory of Deep Learning

• Probability is the science of quantifying uncertain things.
• Most machine learning and deep learning systems use a large amount of data to learn
the patterns in that data.
• Whenever a system relies on data rather than on logic alone, uncertainty arises,
and whenever uncertainty arises, probability becomes relevant.
• By introducing probability to a deep learning system, we introduce common sense
to the system: it can weigh how likely each outcome is instead of treating every
conclusion as certain.
• In deep learning, several models such as Bayesian models, probabilistic graphical
models and Hidden Markov models are used.
• They depend entirely on probability concepts.
Real-world data is noisy and chaotic. Since deep learning systems use real-world data, they
require a tool to handle this uncertainty, and probability is that tool.
Back Propagation Networks (BPN)
Need for Multilayer Networks

• Single-layer networks can solve only linearly separable problems; they cannot
solve linearly inseparable problems
• Single-layer networks cannot solve complex problems
• Single-layer networks are not suitable when a large input-output data set is available
• Single-layer networks cannot capture the complex information available in the
training pairs
Hence, to overcome the above limitations, we use multi-layer networks.
Multi-Layer Networks

• Any neural network that has at least one layer between the input and output layers is
called a multi-layer network
• The layers between the input and output layers are called hidden layers
• An input-layer unit just collects the inputs and forwards them to the next higher
layer
• Hidden-layer and output-layer units process the information fed to them and
produce an appropriate output
• Multi-layer networks can provide a solution for arbitrary classification problems
• Multi-layer networks implement linear discriminants, but in a space where the inputs
have been mapped non-linearly by the hidden layers

Back Propagation Networks (BPN)
Introduced by Rumelhart, Hinton, & Williams in 1986.

• BPN is a multilayer feedforward network in which the error is propagated backwards,
hence the name Back Propagation Network (BPN).
• It uses a supervised training process with a systematic procedure for training
the network, and is used in error detection and correction. The Generalized Delta
Rule (also called the Continuous Perceptron Rule or Gradient Descent Rule) is used
in this network.
• The generalized delta rule minimizes the mean squared error between the output
calculated by the network and the target output.
• The delta rule converges faster than the perceptron rule, of which it is an
extended version. Its limitation is the local minima problem.
• Local minima reduce the convergence speed, but it is still better than the
perceptron’s. Figure 1 represents a BPN network architecture.
• Multi-level perceptrons can also be used, but they are less flexible and efficient
than BPN.
• In Figure 1 the weights between the input layer and the hidden layer are denoted Wij,
and the weights between the first hidden layer and the next layer are denoted Vjk.
• This network is valid only for differentiable output (activation) functions. The
training process used in backpropagation involves three stages:
1. Feedforward of the input training pair
2. Calculation and backpropagation of the associated error
3. Adjustment of the weights

BPN Algorithm

The algorithm for BPN is divided into four major steps:
1. Initialization of weights and biases
2. Feedforward process
3. Backpropagation of errors
4. Updating of weights and biases
Algorithm:
I. Initialization of weights
Step 1: Initialize the weights to small random values near zero
Step 2: While the stop condition is false, do steps 3 to 10
Step 3: For each training pair, do steps 4 to 9
II. Feedforward of inputs
Step 4: Each input $x_i$ is received and forwarded to the next (hidden) layer
Step 5: Each hidden unit sums its weighted inputs: $Z_{in,j} = W_{0j} + \sum_i x_i W_{ij}$
Applying the activation function: $Z_j = f(Z_{in,j})$
This value is passed to the output layer
Step 6: Each output unit sums its weighted inputs: $Y_{in,k} = V_{0k} + \sum_j Z_j V_{jk}$
Applying the activation function: $Y_k = f(Y_{in,k})$

III. Backpropagation of errors

Step 7: For each output unit with target $t_k$: $\delta_k = (t_k - Y_k) f'(Y_{in,k})$
Step 8: For each hidden unit: $\delta_{in,j} = \sum_k \delta_k V_{jk}$ and $\delta_j = \delta_{in,j} f'(Z_{in,j})$

IV. Updating of weights and biases

Step 9: With learning rate $\alpha$, the weight corrections are
$\Delta V_{jk} = \alpha \delta_k Z_j$ and $\Delta W_{ij} = \alpha \delta_j x_i$
and the bias corrections are
$\Delta V_{0k} = \alpha \delta_k$ and $\Delta W_{0j} = \alpha \delta_j$
The new weights are
$W_{ij}(new) = W_{ij}(old) + \Delta W_{ij}$ and $V_{jk}(new) = V_{jk}(old) + \Delta V_{jk}$
The new biases are
$W_{0j}(new) = W_{0j}(old) + \Delta W_{0j}$ and $V_{0k}(new) = V_{0k}(old) + \Delta V_{0k}$
Step 10: Test for the stop condition
Merits
• Has a smooth effect on weight correction
• Computing time is less if the weights are small
• 100 times faster than the perceptron model
• Has a systematic weight updating procedure

Demerits
• Learning phase requires intensive calculations
• Selection of number of Hidden layer neurons is an issue
• Selection of number of Hidden layers is also an issue
• Network gets trapped in Local Minima
• Temporal Instability
• Network Paralysis
• Training time is more for Complex problems
Regularization
A fundamental problem in machine learning is how to make an algorithm that will perform
well not just on the training data, but also on new inputs. Many strategies used in machine
learning are explicitly designed to reduce the test error, possibly at the expense of increased
training error. These strategies are known collectively as regularization.
Definition: - “any modification we make to a learning algorithm that is intended to reduce its
generalization error but not its training error.”

In the context of deep learning, most regularization strategies are based on regularizing
estimators.

Regularization of an estimator works by trading increased bias for reduced variance.

An effective regularizer is one that makes a profitable trade, reducing variance significantly
while not overly increasing the bias.

• Many regularization approaches are based on limiting the capacity of models, such as
neural networks, linear regression, or logistic regression, by adding a parameter norm
penalty Ω(θ) to the objective function J. We denote the regularized objective function
by $\tilde{J}$:
$\tilde{J}(\theta; X, y) = J(\theta; X, y) + \alpha \Omega(\theta)$

where α ∈ [0, ∞) is a hyperparameter that weights the contribution of the norm penalty
term, Ω, relative to the standard objective function J. Setting α to 0 results in no regularization.
Larger values of α correspond to more regularization.

For neural networks, we typically use a parameter norm penalty Ω that penalizes only the
weights of the affine transformation at each layer and leaves the biases unregularized.
L2 Regularization

One of the simplest and most common kinds of parameter norm penalty is the L2 parameter
norm penalty, commonly known as weight decay. This regularization strategy drives the
weights closer to the origin by adding a regularization term
$\Omega(\theta) = \frac{1}{2} \|w\|_2^2$ to the objective function. L2
regularization is also known as ridge regression or Tikhonov regularization. To
simplify, we assume no bias parameter, so θ is just w. Such a model has the following
total objective function:
$\tilde{J}(w; X, y) = \frac{\alpha}{2} w^\top w + J(w; X, y)$
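
To make the weight decay effect concrete, here is a minimal sketch that computes the L2 penalty term and performs one gradient step on the regularized objective. The function names, the learning rate `lr` and the plain-list representation of w are assumptions made for illustration only.

```python
def l2_penalty(w, alpha):
    """Penalty term alpha * Omega(w) with Omega(w) = 0.5 * ||w||^2."""
    return 0.5 * alpha * sum(wi * wi for wi in w)


def sgd_step_with_weight_decay(w, grad, alpha, lr):
    """One gradient step on the regularized objective.

    `grad` is the gradient of the unregularized objective J with respect
    to w. The penalty contributes alpha * w to the gradient, so every
    step also shrinks each weight towards the origin.
    """
    return [wi - lr * (gi + alpha * wi) for wi, gi in zip(w, grad)]


# Even with a zero data gradient, the weights decay towards zero.
w = [3.0, -4.0]
penalty = l2_penalty(w, alpha=0.1)
w_next = sgd_step_with_weight_decay(w, grad=[0.0, 0.0], alpha=0.1, lr=0.5)
```

With α = 0 the update reduces to ordinary gradient descent, which matches the statement that setting α to 0 results in no regularization.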
Difference between L1 & L2 Parameter Regularization

• L1 regularization attempts to estimate the median of the data, whereas L2 regularization
estimates the mean of the data in order to avoid overfitting.
• L1 regularization adds the absolute value of the weights to the cost function as the
penalty term, whereas L2 regularization adds the squared value of the weights.
• L1 regularization can help with feature selection by eliminating unimportant
features (it can drive their weights to exactly zero), whereas L2 regularization is not
recommended for feature selection.
• L1 does not have a closed-form solution, since the absolute value is not differentiable
at zero, while L2 has a closed-form solution (for example, in ridge regression) because
its penalty is the square of the weights.

Batch Normalization:
It is a method of adaptive reparameterization, motivated by the difficulty of training
very deep models. In deep networks, the weights of every layer are updated during
training, so the output of a layer will no longer be on the same scale as its input
(even though the input is normalized).
Normalization is a data pre-processing tool used to bring numerical data to a common
scale without distorting its shape. When we input data to a machine learning or deep
learning algorithm, we tend to rescale the values to a balanced range so that the
model can generalize appropriately.
Procedure to do Batch Normalization:
(1) Consider the batch of hidden activations from layer h. First, calculate the mean
of these hidden activations.
(2) Next, calculate the standard deviation of the hidden activations.
(3) Normalize the hidden activations using this mean and standard deviation: subtract
the mean from each activation and divide the result by the sum of the standard
deviation and the smoothing term (ε), which avoids division by zero.
(4) Finally, re-scale and offset the normalized values. Two components of the BN
algorithm are used here, γ (gamma) and β (beta): γ re-scales and β shifts the vector
that contains the values from the previous operations.
These two parameters are learnable. During the training of the neural network, the
optimal values of γ and β are found and used, so each batch is normalized accurately.
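
The four steps can be sketched for a single hidden unit as follows. This is a minimal illustration, assuming a batch given as a plain list of activations and fixed values for γ and β (in a real network they would be learned). It divides by (standard deviation + ε) as described above; the original batch normalization paper divides by the square root of (variance + ε) instead, which serves the same purpose.

```python
import math


def batch_norm(h, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one batch of activations h for a single hidden unit."""
    m = len(h)
    # (1) Mean of the hidden activations in the batch.
    mean = sum(h) / m
    # (2) Standard deviation of the hidden activations.
    std = math.sqrt(sum((v - mean) ** 2 for v in h) / m)
    # (3) Subtract the mean and divide by (std + eps).
    h_norm = [(v - mean) / (std + eps) for v in h]
    # (4) Re-scale with gamma and shift with beta.
    return [gamma * v + beta for v in h_norm]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
shifted = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=3.0)
```

After step (3) the batch has a mean of about 0 and a standard deviation of about 1; step (4) then lets the network move the activations to whatever scale and offset training finds useful.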
Shallow Networks
Shallow neural networks, which consist of only 1 or 2 hidden layers, give us a basic
idea of how a deep neural network works. Understanding a shallow neural network gives
us insight into what exactly is going on inside a deep neural network. A neural
network is built using various hidden layers. Now that we know the computations that
occur in a particular layer, let us understand how the whole neural network computes
the output for a given input X. These computations can also be called the
forward-propagation equations.
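
The equations themselves did not survive in these notes. As a hedged reconstruction, a network with one hidden layer is commonly written as below; the symbols $W^{[l]}$, $b^{[l]}$ (weights and biases of layer $l$), $g$ (activation function) and $\hat{y}$ (network output) are assumed notation, not taken from the notes.

```latex
\begin{aligned}
z^{[1]} &= W^{[1]} X + b^{[1]}, & a^{[1]} &= g\left(z^{[1]}\right) \\
z^{[2]} &= W^{[2]} a^{[1]} + b^{[2]}, & \hat{y} &= a^{[2]} = g\left(z^{[2]}\right)
\end{aligned}
```

A deep network repeats the same pair of operations, a weighted sum followed by an activation, once per hidden layer.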

Difference Between a Shallow Net & Deep Learning Net:

Convolution Networks:

Imagine there’s an image of a bird, and you want to identify whether it’s really a bird or some
other object. The first thing you do is feed the pixels of the image in the form of arrays to the
input layer of the neural network (multi-layer networks used to classify things). The hidden
layers carry out feature extraction by performing different calculations and manipulations.
There are multiple hidden layers like the convolution layer, the ReLU layer, and pooling layer,
that perform feature extraction from the image. Finally, there’s a fully connected layer that
identifies the object in the image.

What is a Convolutional Neural Network?
A convolutional neural network is a feed-forward neural network that is generally used to analyze visual
images by processing data with grid-like topology. It’s also known as a ConvNet. A convolutional
neural network is used to detect and classify objects in an image.

Below is a neural network that identifies two types of flowers: Orchid and Rose.

Layers in a Convolutional Neural Network

A convolution neural network has multiple hidden layers that help in extracting information
from an image. The important layers in a CNN are:

1. Convolution layer

2. ReLU layer / Activation layer

3. Pooling layer

4. Flattening

5. Fully connected layer

6. Output layer

Convolution Layer

This is the first step in the process of extracting valuable features from an image. A convolution
layer has several filters that perform the convolution operation. Every image is considered as a
matrix of pixel values.

Consider the following 5x5 image whose pixel values are either 0 or 1. There’s also a filter
matrix with a dimension of 3x3. Slide the filter matrix over the image and compute the dot
product to get the convolved feature matrix.
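
The sliding dot product can be sketched in a few lines. The pixel values and filter values below are illustrative assumptions (the figure with the actual numbers is not part of these notes), and `convolve2d` is a hypothetical helper name. Like most CNN libraries, the sketch does not flip the filter, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return
    the convolved feature matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Dot product of the filter with the patch at (r, c).
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Assumed 5x5 binary image and 3x3 filter.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = convolve2d(image, kernel)
# feature_map is 3x3: [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5x5 image and a 3x3 filter give a 3x3 feature matrix because the filter fits in only three positions along each axis.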

ReLU layer

ReLU stands for the rectified linear unit. Once the feature maps are extracted, the next step is
to move them to a ReLU layer. ReLU performs an element-wise operation and sets all the
negative pixels to 0. It introduces non-linearity to the network, and the generated output is
a rectified feature map. Below is the graph of a ReLU function:

The original image is scanned with multiple convolutions and ReLU layers for locating the
features.
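
A minimal sketch of the element-wise ReLU operation on a feature map, assuming the map is stored as a list of rows; the input values are made up for illustration.

```python
def relu(feature_map):
    """Set every negative entry to 0 and keep the rest unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]


rectified = relu([[-3, 2], [0, -1]])
# rectified is [[0, 2], [0, 0]]
```

The output has the same shape as the input; only the negative values change.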

Pooling Layer

Pooling is a down-sampling operation that reduces the dimensionality of the feature map. The
rectified feature map now goes through a pooling layer to generate a pooled feature map.

After pooling, the feature maps still highlight the different parts of the image picked
out by the filters, such as edges, corners, body, feathers, eyes, and beak.
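
Max pooling is one common form of this down-sampling. The sketch below assumes a 2x2 window with stride 2 and an illustrative 4x4 rectified feature map; the function name and values are hypothetical.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            row.append(max(feature_map[r + i][c + j]
                           for i in range(size) for j in range(size)))
        pooled.append(row)
    return pooled


rectified_map = [
    [1, 4, 2, 7],
    [2, 6, 8, 5],
    [3, 4, 0, 7],
    [1, 2, 3, 1],
]
pooled_map = max_pool2d(rectified_map)
# pooled_map is [[6, 8], [4, 7]]
```

The 4x4 map shrinks to 2x2, which is the reduction in dimensionality described above.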

Here’s how the structure of the convolution neural network looks so far:

The next step in the process is called flattening. Flattening is used to convert all the resultant
2-dimensional arrays from the pooled feature maps into a single long continuous linear vector.

The flattened vector is fed as input to the fully connected layer to classify the image.

Here’s how exactly CNN recognizes a bird:

• The pixels from the image are fed to the convolutional layer, which performs the convolution
operation

• This results in a convolved map

• A ReLU function is applied to the convolved map to generate a rectified feature map

• The image is processed with multiple convolution and ReLU layers to locate the
features

• Pooling layers then down-sample the rectified feature maps, keeping the strongest responses
for specific parts of the image

• The pooled feature map is flattened and fed to a fully connected layer to get the final output

• Activation Layer

The activation layer introduces nonlinearity into the network by applying an activation function
to the output of the previous layer. This is crucial for the network to learn complex patterns.
Common activation functions, such as ReLU, Tanh, and Leaky ReLU, transform the input
while keeping the output size unchanged.

• Flattening

After the convolution and pooling operations, the feature maps still exist in a multi-dimensional
format. Flattening converts these feature maps into a one-dimensional vector. This process is
essential because it prepares the data to be passed into fully connected layers for classification
or regression tasks.
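
As a small sketch, flattening a stack of pooled feature maps amounts to reading the values out row by row. The nested-list layout and the values are assumptions for illustration.

```python
def flatten(feature_maps):
    """Turn a list of 2-D feature maps into one 1-D vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


vector = flatten([[[6, 8], [4, 7]], [[1, 0], [2, 3]]])
# vector is [6, 8, 4, 7, 1, 0, 2, 3]
```

Two 2x2 maps become a vector of length 8, which a fully connected layer can take as its input.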

• Output Layer

In the output layer, the final result from the fully connected layers is processed through a
logistic function, such as sigmoid or softmax. These functions convert the raw scores into
probability distributions, enabling the model to predict the most likely class label.
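
The softmax case can be sketched as follows. The raw scores are made-up values for three hypothetical classes; subtracting the maximum score before exponentiating is a standard numerical-stability step and does not change the result.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))  # index of the most likely class
```

The class with the largest raw score always receives the largest probability, so the predicted label is the index of the maximum.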

Generative Adversarial networks

Deep learning and neural networks, a part of machine learning, are powerful enough to generate,
with the help of training data, new human faces that never existed before yet appear natural.
This is possible with the technology named GAN, or Generative Adversarial Networks.

Generative adversarial networks (GANs) are among the most popular recent unsupervised
machine learning innovations, introduced by Ian J. Goodfellow and colleagues in 2014.

• A GAN is a machine learning framework with two neural networks that work together
and can analyze, capture and copy the variations within a dataset.
• The two neural networks work against one another, hence the name
adversarial networks.
• GANs are most often used in ML applications such as image generation, video generation,
and speech generation.

A Generative Adversarial Network (GAN) is a generative modeling technique used
to generate new data sets based on training data sets. The newly generated data appears similar
to the training data. The three words in the name mean the following:

o Generative: It learns a generative model that describes how data is
generated.
o Adversarial: The two neural networks compete with each other, so the
model is trained in an adversarial manner.
o Networks: It uses deep neural networks to train the models, hence "networks".

A GAN has two components:

o Discriminator: It follows a supervised machine learning approach in which a simple
classifier is used to discriminate between real and fake data. It is trained
on the actual training data and on generated samples, and gives feedback to the generator.
o Generator: Unlike the discriminator, the generator is an unsupervised machine
learning method used to generate fake samples based on the actual training data. It is
also a neural network with hidden layers, activations, and a loss function.
The generator focuses on generating fake data based on the feedback
given by the discriminator, and tries to fool the discriminator so that it cannot tell
the difference between actual data and the output generated by the generator.
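
The notes do not give the loss functions, so the following is only a sketch of one common choice: binary cross-entropy for the discriminator and the "non-saturating" loss for the generator. `d_real` and `d_fake` are assumed to be the discriminator's probability outputs (strictly between 0 and 1) for a real sample and a generated sample.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and fake samples near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator is fooled, i.e. scores fakes near 1."""
    return -math.log(d_fake)


# An undecided discriminator outputs 0.5 for both kinds of sample.
undecided = discriminator_loss(0.5, 0.5)
```

The two losses pull in opposite directions on `d_fake`, which is the adversarial feedback loop between the two networks.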

Types of GANs
DCGAN: DCGAN or Deep Convolutional GAN is one of the most famous implementations
of GAN. It makes use of ConvNets instead of multi-layered perceptrons. The ConvNets use a
convolutional stride and are built without max pooling. Further, the layers in the ConvNets are
not fully connected.
Conditional GAN (CGAN): Unlike an unconditional GAN, it is a deep learning neural network that
takes extra parameters. Labels are added to the inputs so that the discriminator can classify
its input more easily.
Least Square GAN: It is a type of generative adversarial network that uses the least-squares
loss function for the discriminator. Minimizing the objective function of the least
square GAN also minimizes the Pearson divergence.
Auxiliary Classifier GAN: ACGAN or Auxiliary Classifier GAN is a similar but improved
version of CGAN. Its discriminator not only classifies an image as real or fake but also gives
information about the source of the input image.
Dual Video Discriminator GAN: It is a GAN for video generation built
upon the BigGAN architecture. It uses a spatial and a temporal discriminator for
generating videos.
SRGAN: Super Resolution GAN or SRGAN, a form of domain transformation, is primarily used
to transform low-resolution images to high resolution.
Cycle GAN: It is used to perform image translation. E.g., once it is trained on a horse image
dataset and a zebra image dataset, it can translate horse images into zebra images.
Info GAN: It is an advanced version of generative adversarial networks used for
unsupervised machine learning.
