Module 2 Deep Feed Forward Networks
A feed-forward network defines a mapping y = f(x; θ) and learns the values of the
parameters θ that result in the best function approximation. These models are called
feed-forward because information flows through the function being evaluated from x,
through the intermediate computations used to define f, and finally to the output y.
There are no feedback connections in which outputs of the model are fed back into itself.
When feed-forward neural networks are extended to include feedback connections, they
are called recurrent neural networks.
The XOR (exclusive or) function is a logical operation on two binary inputs, x1 and x2, where the output
is 1 if one, and only one, of the inputs is 1; otherwise, the output is 0. This function is challenging for
linear models because XOR is not linearly separable. To solve the XOR problem with a neural network,
we need a non-linear model. Here's a breakdown of how a simple feed-forward neural network with one
hidden layer can be used to solve this problem.
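As a concrete sketch, the following minimal NumPy network solves XOR with one ReLU hidden layer. The specific weight values used here are one well-known exact solution and are an assumption not stated above; in practice they would be found by gradient-based learning.

```python
import numpy as np

# Hidden layer: h = ReLU(x W + c); output: y_hat = h w + b.
# These exact parameter values solve XOR; normally they would be learned.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])
b = 0.0

def xor_net(x):
    h = np.maximum(0.0, x @ W + c)   # rectified linear hidden units
    return h @ w + b                 # linear output unit

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(xor_net(X))  # -> [0. 1. 1. 0.]
```

The hidden layer maps the four input points into a space where they become linearly separable, which is exactly what a linear model alone cannot achieve.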
Figure 6.3: The rectified linear activation function. This activation function is the default activation
function recommended for use with most feed-forward neural networks.
Gradient-based learning is the backbone of many deep learning algorithms. This approach involves
iteratively adjusting model parameters to minimize the loss function, which measures the difference
between the actual and predicted outputs. At its core, gradient-based learning leverages
the gradient of the loss function to navigate the complex landscape of parameters.
Gradient
The gradient is a vector that points in the direction of the steepest increase of a
function. In the context of machine learning, it shows how to adjust the model's
parameters to increase or decrease the loss.
A positive gradient indicates that increasing the parameter will increase the loss, while a
negative gradient indicates that increasing the parameter will decrease the loss.
Gradient Descent
Gradient Descent is an optimization algorithm used to minimize the loss function. The
algorithm updates the model’s parameters in the opposite direction of the gradient.
The size of the steps taken in the direction of the negative gradient is determined by a
hyper-parameter called the learning rate. A smaller learning rate makes more precise
but slower updates, while a larger learning rate speeds up learning but may cause
instability.
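A minimal sketch of the update rule on a toy quadratic loss (the loss function and learning-rate value are illustrative assumptions):

```python
# Minimize the toy loss L(theta) = (theta - 3)^2 by gradient descent.
def loss(theta):
    return (theta - 3.0) ** 2

def grad(theta):
    return 2.0 * (theta - 3.0)   # derivative of the loss

theta = 0.0
learning_rate = 0.1              # step-size hyper-parameter
for _ in range(100):
    theta -= learning_rate * grad(theta)  # step against the gradient

print(round(theta, 4))  # -> 3.0, the minimizer of the loss
```

With a much larger learning rate (e.g. 1.5 here) the iterates would overshoot and diverge, illustrating the instability mentioned above.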
Deriving the cost function via maximum likelihood means the cost follows from the model
itself, removing the need to design one manually for each model.
Mean Squared Error (MSE) and Gaussian Distribution: For models that predict a Gaussian
distribution over the outputs with mean f(x; θ), the maximum-likelihood cost function
reduces to the mean squared error (MSE):

J(θ) = ½ 𝔼_{x,y∼p̂_data} ‖y − f(x; θ)‖² + const

Here, the MSE cost function is linked to maximum likelihood estimation for models
predicting a Gaussian distribution with mean f(x; θ).

f∗: the optimal function that minimizes the mean squared error; it predicts the mean of
the true conditional distribution, f∗(x) = 𝔼_{y∼p_data(y|x)}[y].
2. Output Units and Gradient-Based Learning
Linear output units are typically paired with the assumption that y follows a conditional
Gaussian distribution whose mean is the unit's output. Cost function: minimizing the
negative log-likelihood of this Gaussian results in the mean squared error (MSE) cost function.
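A quick numerical check of this equivalence, assuming a fixed unit variance (the data values are illustrative):

```python
import numpy as np

# Average negative log-likelihood of y under a Gaussian with mean
# y_pred and fixed unit variance.
def gaussian_nll(y, y_pred):
    return np.mean(0.5 * np.log(2 * np.pi) + 0.5 * (y - y_pred) ** 2)

def half_mse(y, y_pred):
    return 0.5 * np.mean((y - y_pred) ** 2)

y      = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 2.7])

# The NLL differs from half the MSE only by the constant 0.5*log(2*pi),
# so minimizing one minimizes the other.
const = 0.5 * np.log(2 * np.pi)
print(np.isclose(gaussian_nll(y, y_pred) - half_mse(y, y_pred), const))  # True
```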
3. Hidden Units
The design of hidden units is an extremely active area of research. Rectified linear units
are an excellent default choice of hidden unit.
3.1 Rectified Linear Units (ReLU):
ReLUs are easy to optimize because their gradients are large and consistent whenever the
unit is active. This property makes them less prone to vanishing gradients.
ReLUs are well-suited for deep learning models as they maintain strong gradients and
are computationally efficient.
3.2 Sigmoid and Hyperbolic Tangent (Tanh)
Prior to the introduction of rectified linear units, most neural networks used the logistic
sigmoid activation function.
Sigmoid activation function: g(z)=σ(z) where σ(z) is the logistic function.
Tanh activation function: g(z)=tanh(z)
Tanh typically performs better than the sigmoid because its outputs are centered at zero,
which reduces the problems caused by saturated outputs.
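The three activation functions above can be sketched as follows (the sample inputs are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

z = np.array([-5.0, 0.0, 5.0])
print(relu(z))     # gradient is 1 wherever the unit is active (z > 0)
print(sigmoid(z))  # saturates toward 0 and 1 for large |z|
print(tanh(z))     # zero-centered; saturates toward -1 and 1

# Saturation in action: the sigmoid's derivative nearly vanishes at z = 5,
# while the ReLU's derivative there is exactly 1.
print(sigmoid(5.0) * (1 - sigmoid(5.0)))  # ~0.0066
```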
Feed-forward Propagation:
In a feed-forward neural network, the input x is passed through multiple layers of the
network until an output ŷ is produced. This process of information moving through the
network, from input to output, is called forward propagation.
During training, forward propagation continues onward to compute a scalar cost, which
measures how far the network's output ŷ is from the true output y.
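A minimal sketch of forward propagation through two ReLU hidden layers ending in a scalar half-squared-error cost (the layer sizes, random weights, and loss choice are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters for a 3 -> 4 -> 4 -> 1 network.
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)
W3, b3 = rng.normal(size=(4, 1)), np.zeros(1)

def forward(x):
    h1 = np.maximum(0.0, x @ W1 + b1)   # hidden layer 1 (ReLU)
    h2 = np.maximum(0.0, h1 @ W2 + b2)  # hidden layer 2 (ReLU)
    return h2 @ W3 + b3                 # linear output y_hat

x = rng.normal(size=(1, 3))
y = np.array([[1.0]])
y_hat = forward(x)
cost = 0.5 * np.mean((y_hat - y) ** 2)  # scalar cost computed during training
print(cost.shape)  # () -- a single scalar, as required for gradient-based learning
```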
Back-propagation is a method for efficiently calculating the gradient of the cost function J(θ)
with respect to the model's parameters θ.
While back-propagation itself refers only to the gradient computation, the actual learning
process uses an optimization algorithm, such as stochastic gradient descent (SGD), that
applies the gradients computed by back-propagation to update the model's parameters.
Forward propagation moves inputs through the network to produce outputs, and back-propagation
helps in computing the gradients needed for optimizing the model’s parameters
Computational graphs and the chain rule are crucial in the computation of gradients.
Computational graphs provide a formal and structured way to visualize and describe the
flow of computations.
1. Computational Graph
Chain Rule:
1. Let x be a real number, and let f and g both be functions mapping from a real number to a
real number. Suppose that y = g(x) and z = f(g(x)) = f(y).
Then the chain rule states that

dz/dx = (dz/dy)(dy/dx)

2. For vectors: suppose x ∈ ℝᵐ, y ∈ ℝⁿ, g maps ℝᵐ to ℝⁿ, and f maps ℝⁿ to ℝ.
If y = g(x) and z = f(y), then

∂z/∂xᵢ = Σⱼ (∂z/∂yⱼ)(∂yⱼ/∂xᵢ)

and, in vector notation,

∇ₓz = (∂y/∂x)ᵀ ∇_y z,

where ∂y/∂x is the n × m Jacobian matrix of g.
The chain rule in the scalar case can be seen as multiplying simple
derivatives. In the vector case, it becomes a matrix multiplication of Jacobians.
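The vector form of the chain rule can be verified numerically; the functions g and f below are illustrative choices made for this example:

```python
import numpy as np

# y = g(x) with g(x) = (x1*x2, x1 + x2); z = f(y) = y1^2 + 3*y2.
def g(x):
    return np.array([x[0] * x[1], x[0] + x[1]])

def f(y):
    return y[0] ** 2 + 3.0 * y[1]

x = np.array([2.0, 5.0])
y = g(x)

# Jacobian dy/dx of g, and the gradient of z with respect to y.
J = np.array([[x[1], x[0]],     # row 1: d(x1*x2)/dx1, d(x1*x2)/dx2
              [1.0,  1.0]])     # row 2: d(x1+x2)/dx1, d(x1+x2)/dx2
grad_y = np.array([2.0 * y[0], 3.0])

# Chain rule in vector form: grad_x z = J^T @ grad_y z.
grad_x = J.T @ grad_y

# Central finite-difference check of the same gradient.
eps = 1e-6
num = np.array([(f(g(x + eps * e)) - f(g(x - eps * e))) / (2 * eps)
                for e in np.eye(2)])
print(np.allclose(grad_x, num))  # True
```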
Algorithm 6.3: Forward propagation through a typical deep neural network and the computation of the
cost function. The loss L(ŷ, y) depends on the output ŷ and on the target y.
To obtain the total cost J, the loss may be added to a regularizer Ω(θ ), where θ contains all the
parameters (weights and biases).
The backward computation for the deep neural network of algorithm 6.3 uses, in addition to the input
x, a target y.
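The forward and backward passes can be sketched together for a one-hidden-layer network. The shapes, the half-squared-error loss, and the finite-difference check are illustrative assumptions, not the book's exact notation:

```python
import numpy as np

rng = np.random.default_rng(1)

# One hidden layer: h = ReLU(x W1 + b1), y_hat = h W2 + b2.
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)
x = rng.normal(size=(1, 2))
y = np.array([[1.0]])

# Forward pass: store intermediate values for reuse in the backward pass.
a1 = x @ W1 + b1
h = np.maximum(0.0, a1)
y_hat = h @ W2 + b2
L = 0.5 * np.sum((y_hat - y) ** 2)      # scalar cost

# Backward pass: propagate gradients from the cost using the chain rule.
g = y_hat - y                   # dL/dy_hat
dW2 = h.T @ g                   # gradient for W2
db2 = g.sum(axis=0)             # gradient for b2
g = g @ W2.T                    # dL/dh
g = g * (a1 > 0)                # dL/da1 (gate through the ReLU)
dW1 = x.T @ g                   # gradient for W1
db1 = g.sum(axis=0)             # gradient for b1

# Finite-difference check on one weight to confirm the gradient.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
Lp = 0.5 * np.sum((np.maximum(0.0, x @ W1p + b1) @ W2 + b2 - y) ** 2)
print(np.isclose(dW1[0, 0], (Lp - L) / eps, atol=1e-4))  # True
```

An optimizer such as SGD would then apply `W1 -= learning_rate * dW1` (and likewise for the other parameters) to complete one learning step.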