
Module 3

Deep Learning 10 Hours

• Artificial Neural Networks (ANN): architecture

• Feed-forward and back propagation

• Activation functions

• Optimizers in deep learning

• Regularization techniques

• Recurrent neural networks

• Transfer learning

Single-Layer Neural Networks

[Figure: a single-layer neural network showing the input nodes, weights, bias, and output node.]
Training of a Single-Layer Neural Network: Delta Rule

wij ← wij + αeixj,  where ei = di − yi

di is the correct output of the output node i, and ei is the error of that node.

The learning rate, α, determines how much the weight is changed per update.

Training of a Single-Layer Neural Network: Delta Rule

Updated weights

• A single-layer neural network with three input nodes and one output node.
• The weight between input node 2 and output node 1 is denoted as w12.

Training process using the delta rule for the single-layer neural network: "Supervised Learning of a Neural Network"

Epoch: one training iteration in which all of the training data goes through Steps 2-5 once.

Generalized Delta Rule
• For an arbitrary activation function, the delta rule is expressed as

  Δwij = αδixj,  where δi = φ'(vi)ei  and  ei = di − yi

  vi is the weighted sum of output node i, and φ'(.) is the derivative of the activation function.

• For a linear activation function, φ(x) = x.

• The derivative of this function is φ'(x) = 1, so δi = ei and the rule reduces to Δwij = α(di − yi)xj.
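As a sketch, the rule fits in a few lines of numpy; the function and parameter names below are illustrative, not from the slides:

```python
import numpy as np

def delta_rule_update(W, x, d, alpha, phi, phi_prime):
    """One generalized delta-rule update for a single-layer network.

    W: (num_outputs, num_inputs) weight matrix
    x: (num_inputs,) input vector
    d: (num_outputs,) correct outputs
    """
    v = W @ x                     # weighted sums of the output nodes
    y = phi(v)                    # outputs of the network
    e = d - y                     # errors e_i = d_i - y_i
    delta = phi_prime(v) * e      # delta_i = phi'(v_i) * e_i
    return W + alpha * np.outer(delta, x)   # w_ij <- w_ij + alpha * delta_i * x_j

# With a linear activation phi(x) = x and phi'(x) = 1, this is the plain delta rule:
W = np.zeros((1, 3))
W = delta_rule_update(W, np.array([1.0, 0.0, 1.0]), np.array([1.0]),
                      alpha=0.9, phi=lambda v: v,
                      phi_prime=lambda v: np.ones_like(v))
```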

Delta rule with the sigmoid function

The output from a sigmoid function is within the range 0-1. This behavior of the sigmoid function is useful when the neural network produces probability outputs.

Delta rule for the sigmoid function:

  δi = φ'(vi)ei = yi(1 − yi)(di − yi)

Derivative of the sigmoid function:

  φ'(x) = φ(x)(1 − φ(x))
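A minimal numpy version of the sigmoid and its derivative (the names are illustrative):

```python
import numpy as np

def sigmoid(x):
    """phi(x) = 1 / (1 + e^-x); output lies in the range 0-1."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """phi'(x) = phi(x) * (1 - phi(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)
```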

Calculation of weight updates

• Stochastic Gradient Descent (SGD)
• Batch
• Mini Batch

Stochastic Gradient Descent

• Stochastic Gradient Descent (SGD) calculates the error for each training data point and adjusts the weights immediately. If we have 100 training data points, the SGD adjusts the weights 100 times.
• The SGD calculates the weight updates as:

  Δwij = αδixj
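A hedged sketch of SGD training for a single-layer network, reusing the numpy conventions assumed in the earlier examples:

```python
import numpy as np

def train_sgd(W, X, D, alpha, epochs, phi, phi_prime):
    """SGD: update the weights immediately after each training point.

    X: (N, num_inputs) training inputs; D: (N, num_outputs) correct outputs.
    With 100 training points, the weights are adjusted 100 times per epoch.
    """
    for _ in range(epochs):
        for x, d in zip(X, D):
            v = W @ x
            delta = phi_prime(v) * (d - phi(v))
            W = W + alpha * np.outer(delta, x)   # immediate update
    return W
```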

Batch
• Each weight update is calculated for all errors of the training data, and the average of the weight updates is used for adjusting the weights.

• This method uses all of the training data and updates the weights only once per epoch:

  Δwij = (1/N) Σk Δwij(k)   (sum over k = 1, ..., N)

where Δwij(k) is the weight update for the k-th training data point and N is the total number of training data points.
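A matching sketch of the batch method under the same assumptions; note the single averaged update per epoch:

```python
import numpy as np

def train_batch(W, X, D, alpha, epochs, phi, phi_prime):
    """Batch method: average the weight updates over all N training
    points, then adjust the weights once per epoch."""
    N = len(X)
    for _ in range(epochs):
        dW = np.zeros_like(W)
        for x, d in zip(X, D):
            v = W @ x
            delta = phi_prime(v) * (d - phi(v))
            dW += alpha * np.outer(delta, x)   # accumulate the updates
        W = W + dW / N                         # single averaged update
    return W
```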
Mini Batch
• Because of the averaged weight update calculation, the batch method consumes a significant amount of time for training.
• The mini batch method is a blend of the SGD and batch methods: it selects a part of the training dataset and applies the batch method to it, gaining speed from the SGD and stability from the batch method.
• It calculates the weight updates of the selected data and trains the neural network with the averaged weight update.
• For example, if 20 arbitrary data points are selected out of 100 training data points, the batch method is applied to the 20 data points. In this case, a total of five weight adjustments are performed to complete the training process for all the data points (5 = 100/20).
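A sketch of the mini batch method under the same assumptions; `batch_size` is an illustrative parameter (20 in the example above):

```python
import numpy as np

def train_minibatch(W, X, D, alpha, epochs, batch_size, phi, phi_prime):
    """Mini batch: apply the batch method to random subsets of the data.
    With 100 points and batch_size=20, there are 5 updates per epoch."""
    N = len(X)
    for _ in range(epochs):
        order = np.random.permutation(N)          # arbitrary selection
        for start in range(0, N, batch_size):
            idx = order[start:start + batch_size]
            dW = np.zeros_like(W)
            for x, d in zip(X[idx], D[idx]):
                v = W @ x
                delta = phi_prime(v) * (d - phi(v))
                dW += alpha * np.outer(delta, x)
            W = W + dW / len(idx)                 # averaged update per mini batch
    return W
```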
Limitations of single layer NN

• The single-layer neural network can only solve linearly separable problems.

• This is because the single-layer neural network is a model that linearly divides
the input data space.

• In order to overcome this limitation of the single-layer neural network, more layers are needed in the network.

• This need has led to the appearance of the multi-layer neural network.

Artificial Neural Network Architecture

Feed-Forward Neural Networks
• A collection of neurons connected together in a network can be represented by a directed graph.
• Nodes represent the neurons, and arrows represent the links between them.
• Each node has its number, and a link connecting two nodes will have a pair of numbers (e.g. (1, 4) connecting
nodes 1 and 4).
• Networks without cycles (feedback loops) are called feed-forward networks (or perceptrons).
• Input nodes of the network (nodes 1, 2 and 3) are associated with the input variables (x1, . . . , xm). They do not
compute anything, but simply pass the values to the processing nodes.
• Output nodes (12 and 13) are associated with the output variables (y1, . ..yn).
• Neural networks can have several hidden layers.
• The signal flows in only one direction (from the inputs to the outputs).
• Feed-forward neural networks can be used for classification and, in unsupervised learning, as auto-encoders.

[Figure: a feed-forward network with input nodes 1-3, two hidden layers (nodes 4-7 and 8-11), and output nodes 12-13.]

N-layer neural network:
  N − 1 layers of hidden units
  One output layer
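A minimal forward-pass sketch of the pictured network (3 inputs, two hidden layers of 4 nodes, 2 outputs); the sigmoid activation and random weights here are assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, weights):
    """Forward pass: the signal flows only from inputs to outputs.
    `weights` is a list of weight matrices, one per layer."""
    a = x
    for W in weights:
        a = sigmoid(W @ a)   # each layer: weighted sum, then activation
    return a

# The pictured network: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))]
y = feed_forward(np.array([1.0, 0.5, -0.2]), weights)
```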
Number of Neurons In Input and Output Layers

• The number of neurons in the input layer is equal to the number of features in the data; in rare cases, one additional input node is added for the bias.

• The number of neurons in the output layer depends on whether the model is used as a regressor or a classifier.

• If the model is a regressor, the output layer will have only a single neuron.

• If the model is a classifier, it will have a single neuron or multiple neurons depending on the number of class labels.

Number of Neurons in Hidden Layer

• The number of hidden neurons should be between the size of the input layer and the size of the output layer.
• The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
• The number of hidden neurons should be less than twice the size of the input layer.
• Most problems can be solved using a single hidden layer with the number of neurons equal to the mean of the input and output layer sizes.
• If too few neurons are chosen, the network will underfit and have high statistical bias.
• If too many neurons are chosen, the network may overfit, have high variance, and take longer to train. (The rules of thumb above are evaluated in the sketch below.)
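These rules of thumb are simple arithmetic; the sketch below just evaluates them (the function name and example sizes are hypothetical):

```python
def hidden_neuron_heuristics(n_inputs, n_outputs):
    """Rules of thumb from the slides: starting points, not hard rules."""
    return {
        "two_thirds_rule": round(2 / 3 * n_inputs + n_outputs),
        "upper_bound": 2 * n_inputs,   # stay below twice the input size
        "mean_rule": round((n_inputs + n_outputs) / 2),
    }

print(hidden_neuron_heuristics(n_inputs=10, n_outputs=3))
# {'two_thirds_rule': 10, 'upper_bound': 20, 'mean_rule': 6}
```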

Training of Multi-Layer Neural Network

• The back-propagation algorithm solved the training problem of the multi-layer neural network.
• Significance of the back-propagation algorithm - it provided a systematic method to determine
the error of the hidden nodes.
• Once the hidden layer errors are determined, the delta rule is applied to adjust the weights.
• In the back-propagation algorithm, the output error starts from the output layer and moves backward until it reaches the hidden layer immediately next to the input layer. This process is called back-propagation, as it resembles an output error propagating backward.
• Even in back-propagation, the signal still flows through the connecting lines and the weights are
multiplied. The only difference is that the input and output signals flow in opposite directions.

Back-propagation algorithm

The output error starts from the output layer and moves backward until it reaches the hidden layer immediately next to the input layer.
Back-propagation algorithm
• Consider a neural network that consists of two nodes each for the input and output, and a hidden layer, which has two nodes.

Weighted sum of the hidden nodes:    v(1) = W(1)x
Output from the hidden nodes:        y(1) = φ(v(1))
Weighted sum of the output nodes:    v = W(2)y(1)
Output from the neural network:      y = φ(v)
Back-propagation algorithm
Train the neural network using the back-propagation algorithm

In the back-propagation algorithm, the delta of the output node is defined identically to the delta rule of the "Generalized Delta Rule":

  δi = φ'(vi)ei,  where ei = di − yi

φ'(.) is the derivative of the activation function of the output node,
yi is the output from the output node,
di is the correct output from the training data, and
vi is the weighted sum of the corresponding node.
Back-propagation algorithm
Train the neural network using the back-propagation algorithm.

In the back-propagation algorithm, the error of a node is defined as the weighted sum of the back-propagated deltas from the layer on its immediate right (in this case, the output layer). The error of a hidden node is calculated as this backward weighted sum of the deltas, and the delta of the node is the product of the error and the derivative of the activation function. This process begins at the output layer and repeats for all hidden layers.

Proceed leftward to the hidden nodes and calculate the delta:

  e1(1) = w11(2)δ1 + w21(2)δ2,   δ1(1) = φ'(v1(1))e1(1)
  e2(1) = w12(2)δ1 + w22(2)δ2,   δ2(1) = φ'(v2(1))e2(1)

v1(1) and v2(1) are the weighted sums of the forward signals at the respective nodes; the superscript (1) denotes the hidden layer and (2) the hidden-to-output weights.
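A sketch of this backward pass for the two-layer example, assuming numpy and the notation above (`W2` holds the hidden-to-output weights; the function name is illustrative):

```python
import numpy as np

def backprop_deltas(W2, v1, v, d, y, phi_prime):
    """Backward pass for the two-layer example.

    W2: (num_outputs, num_hidden) hidden-to-output weights
    v1, v: weighted sums at the hidden and output nodes
    d, y: correct and actual outputs
    """
    e = d - y                   # output error
    delta = phi_prime(v) * e    # delta of the output nodes
    e1 = W2.T @ delta           # hidden error: backward weighted sum of the deltas
    delta1 = phi_prime(v1) * e1 # delta of the hidden nodes
    return delta, delta1
```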
Back-propagation algorithm
Error calculation: ei = di − yi at the output nodes.

To adjust the weights of the respective layers, apply the delta rule:

  wij ← wij + αδixj

where xj is the input signal for the corresponding weight (for the hidden-to-output weights, this is the output of the hidden node).

Back-propagation algorithm
1. Initialize the weights with adequate values.
2. Enter the input from the training data { input, correct output } and obtain the neural network's output. Calculate the error of the output against the correct output and the delta, δ, of the output nodes.
3. Propagate the output node delta, δ, backward, and calculate the deltas of the immediate next (left) nodes.
4. Repeat Step 3 until it reaches the hidden layer that is on the immediate right of the input layer.
5. Adjust the weights according to the following learning rule:

   Δwij = αδixj,   wij ← wij + Δwij

6. Repeat Steps 2-5 for every training data point.
7. Repeat Steps 2-6 until the neural network is properly trained.
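Putting Steps 1-7 together, here is a minimal numpy sketch that trains a small network on XOR (the layer sizes, learning rate, and epoch count are illustrative choices, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR data: not linearly separable, so a hidden layer is required.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1 = rng.normal(size=(4, 2))   # Step 1: initialize the weights
W2 = rng.normal(size=(1, 4))
alpha = 0.9

for epoch in range(10000):          # Step 7: repeat until trained
    for x, d in zip(X, D):          # Step 6: every training point
        # Step 2: forward pass, then output error and delta
        v1 = W1 @ x
        y1 = sigmoid(v1)
        v = W2 @ y1
        y = sigmoid(v)
        delta = y * (1 - y) * (d - y)
        # Steps 3-4: propagate the delta backward to the hidden layer
        e1 = W2.T @ delta
        delta1 = y1 * (1 - y1) * e1
        # Step 5: adjust the weights with the delta rule
        W2 += alpha * np.outer(delta, y1)
        W1 += alpha * np.outer(delta1, x)

print(np.round(sigmoid(W2 @ sigmoid(W1 @ X.T)), 3))  # should be close to [[0. 1. 1. 0.]]
```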

