
Module 3

Deep Learning 10 Hours

• Artificial Neural Networks (ANN): architecture

• Feed-forward and back propagation

• Activation functions

• Optimizers in deep learning

• Regularization techniques

• Recurrent neural networks

• Transfer learning

Single-Layer Neural Networks

[Figure: a single-layer neural network showing the input nodes, weights, bias, and output node.]
Training of a Single-Layer Neural Network: Delta Rule

wij ← wij + αeixj,  where ei = di − yi

di is the correct output of the output node i, and ei is the error of that node.

The learning rate, α, determines how much the weight is changed per update.

Training of a Single-Layer Neural Network: Delta Rule

Updated weights

• A single-layer neural network with three input nodes and one output node.
• The weight between input node 2 and output node 1 is denoted as w12.

Training process using the delta rule for the single-layer neural network: "Supervised Learning of a Neural Network"

Epoch: one training iteration in which all of the training data goes through Steps 2-5 once.

Generalized Delta Rule
• For an arbitrary activation function, the delta rule is expressed as

  Δwij = αδixj,  where δi = φ'(vi)ei  and  ei = di − yi

  vi is the weighted sum of output node i, and φ'(.) is the derivative of the activation function.

• For a linear activation function, φ(x) = x.

• The derivative of this function is φ'(x) = 1, so δi = ei and the rule reduces to Δwij = α(di − yi)xj.
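As a sketch, the rule fits in a few lines of numpy; the function and parameter names below are illustrative, not from the slides:

```python
import numpy as np

def delta_rule_update(W, x, d, alpha, phi, phi_prime):
    """One generalized delta-rule update for a single-layer network.

    W: (num_outputs, num_inputs) weight matrix
    x: (num_inputs,) input vector
    d: (num_outputs,) correct outputs
    """
    v = W @ x                     # weighted sums of the output nodes
    y = phi(v)                    # outputs of the network
    e = d - y                     # errors e_i = d_i - y_i
    delta = phi_prime(v) * e      # delta_i = phi'(v_i) * e_i
    return W + alpha * np.outer(delta, x)   # w_ij <- w_ij + alpha * delta_i * x_j

# With a linear activation phi(x) = x and phi'(x) = 1, this is the plain delta rule:
W = np.zeros((1, 3))
W = delta_rule_update(W, np.array([1.0, 0.0, 1.0]), np.array([1.0]),
                      alpha=0.9, phi=lambda v: v,
                      phi_prime=lambda v: np.ones_like(v))
```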

Delta rule with the sigmoid function

The output from a sigmoid function is within the range 0-1. This behavior of the sigmoid function is useful when the neural network produces probability outputs.

Delta rule for the sigmoid function:

  δi = φ'(vi)ei = yi(1 − yi)(di − yi)

Derivative of the sigmoid function:

  φ'(x) = φ(x)(1 − φ(x))
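A minimal numpy version of the sigmoid and its derivative (the names are illustrative):

```python
import numpy as np

def sigmoid(x):
    """phi(x) = 1 / (1 + e^-x); output lies in the range 0-1."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """phi'(x) = phi(x) * (1 - phi(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)
```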

Calculation of weight updates

• Stochastic Gradient Descent (SGD)
• Batch
• Mini Batch

Stochastic Gradient Descent

• Stochastic Gradient Descent (SGD) calculates the error for each training data point and adjusts the weights immediately. If we have 100 training data points, the SGD adjusts the weights 100 times.
• The SGD calculates the weight updates as:

  Δwij = αδixj
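A hedged sketch of SGD training for a single-layer network, reusing the numpy conventions assumed in the earlier examples:

```python
import numpy as np

def train_sgd(W, X, D, alpha, epochs, phi, phi_prime):
    """SGD: update the weights immediately after each training point.

    X: (N, num_inputs) training inputs; D: (N, num_outputs) correct outputs.
    With 100 training points, the weights are adjusted 100 times per epoch.
    """
    for _ in range(epochs):
        for x, d in zip(X, D):
            v = W @ x
            delta = phi_prime(v) * (d - phi(v))
            W = W + alpha * np.outer(delta, x)   # immediate update
    return W
```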

Batch
• Each weight update is calculated for all errors of the training data, and the average of the weight updates is used for adjusting the weights.

• This method uses all of the training data and updates the weights only once per epoch:

  Δwij = (1/N) Σk Δwij(k)   (sum over k = 1, ..., N)

where Δwij(k) is the weight update for the k-th training data point and N is the total number of training data points.
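A matching sketch of the batch method under the same assumptions; note the single averaged update per epoch:

```python
import numpy as np

def train_batch(W, X, D, alpha, epochs, phi, phi_prime):
    """Batch method: average the weight updates over all N training
    points, then adjust the weights once per epoch."""
    N = len(X)
    for _ in range(epochs):
        dW = np.zeros_like(W)
        for x, d in zip(X, D):
            v = W @ x
            delta = phi_prime(v) * (d - phi(v))
            dW += alpha * np.outer(delta, x)   # accumulate the updates
        W = W + dW / N                         # single averaged update
    return W
```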
Mini Batch
• Because of the averaged weight update calculation, the batch method consumes a significant amount of time for training.
• The mini batch method is a blend of the SGD and batch methods: it selects a part of the training dataset and applies the batch method to it, gaining speed from the SGD and stability from the batch method.
• It calculates the weight updates of the selected data and trains the neural network with the averaged weight update.
• For example, if 20 arbitrary data points are selected out of 100 training data points, the batch method is applied to the 20 data points. In this case, a total of five weight adjustments are performed to complete the training process for all the data points (5 = 100/20).
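A sketch of the mini batch method under the same assumptions; `batch_size` is an illustrative parameter (20 in the example above):

```python
import numpy as np

def train_minibatch(W, X, D, alpha, epochs, batch_size, phi, phi_prime):
    """Mini batch: apply the batch method to random subsets of the data.
    With 100 points and batch_size=20, there are 5 updates per epoch."""
    N = len(X)
    for _ in range(epochs):
        order = np.random.permutation(N)          # arbitrary selection
        for start in range(0, N, batch_size):
            idx = order[start:start + batch_size]
            dW = np.zeros_like(W)
            for x, d in zip(X[idx], D[idx]):
                v = W @ x
                delta = phi_prime(v) * (d - phi(v))
                dW += alpha * np.outer(delta, x)
            W = W + dW / len(idx)                 # averaged update per mini batch
    return W
```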
Limitations of single layer NN

• The single-layer neural network can only solve linearly separable problems.

• This is because the single-layer neural network is a model that linearly divides
the input data space.

• In order to overcome this limitation of the single-layer neural network, more layers are needed in the network.

• This need has led to the appearance of the multi-layer neural network.

Artificial Neural Network Architecture

Feed-Forward Neural Networks
• A collection of neurons connected together in a network can be represented by a directed graph.
• Nodes represent the neurons, and arrows represent the links between them.
• Each node has its number, and a link connecting two nodes will have a pair of numbers (e.g. (1, 4) connecting
nodes 1 and 4).
• Networks without cycles (feedback loops) are called feed-forward networks (or perceptrons).
• Input nodes of the network (nodes 1, 2 and 3) are associated with the input variables (x1, . . . , xm). They do not
compute anything, but simply pass the values to the processing nodes.
• Output nodes (12 and 13) are associated with the output variables (y1, . ..yn).
• Neural networks can have several hidden layers.
• The signal flows in only one direction (from the inputs to the outputs).
• Feed-forward neural networks can be used for classification and, in unsupervised learning, as auto-encoders.

[Figure: a feed-forward network with input nodes 1-3, two hidden layers (nodes 4-7 and 8-11), and output nodes 12-13.]

N-layer neural network:
  N − 1 layers of hidden units
  One output layer
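A minimal forward-pass sketch of the pictured network (3 inputs, two hidden layers of 4 nodes, 2 outputs); the sigmoid activation and random weights here are assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, weights):
    """Forward pass: the signal flows only from inputs to outputs.
    `weights` is a list of weight matrices, one per layer."""
    a = x
    for W in weights:
        a = sigmoid(W @ a)   # each layer: weighted sum, then activation
    return a

# The pictured network: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))]
y = feed_forward(np.array([1.0, 0.5, -0.2]), weights)
```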
Number of Neurons In Input and Output Layers

• The number of neurons in the input layer is equal to the number of features in the data; in rare cases, one additional input node is added for the bias.

• The number of neurons in the output layer depends on whether the model is used as a regressor or a classifier.

• If the model is a regressor, the output layer will have only a single neuron.

• If the model is a classifier, it will have a single neuron or multiple neurons depending on the number of class labels.

Number of Neurons in Hidden Layer

• The number of hidden neurons should be between the size of the input layer and the size of the output layer.
• The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
• The number of hidden neurons should be less than twice the size of the input layer.
• Most problems can be solved using a single hidden layer with the number of neurons equal to the mean of the input and output layer sizes.
• If too few neurons are chosen, the network will underfit and have high statistical bias.
• If too many neurons are chosen, the network may overfit, have high variance, and take longer to train. (The rules of thumb above are evaluated in the sketch below.)
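These rules of thumb are simple arithmetic; the sketch below just evaluates them (the function name and example sizes are hypothetical):

```python
def hidden_neuron_heuristics(n_inputs, n_outputs):
    """Rules of thumb from the slides: starting points, not hard rules."""
    return {
        "two_thirds_rule": round(2 / 3 * n_inputs + n_outputs),
        "upper_bound": 2 * n_inputs,   # stay below twice the input size
        "mean_rule": round((n_inputs + n_outputs) / 2),
    }

print(hidden_neuron_heuristics(n_inputs=10, n_outputs=3))
# {'two_thirds_rule': 10, 'upper_bound': 20, 'mean_rule': 6}
```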

Training of Multi-Layer Neural Network

• The back-propagation algorithm solved the training problem of the multi-layer neural network.
• Significance of the back-propagation algorithm - it provided a systematic method to determine
the error of the hidden nodes.
• Once the hidden layer errors are determined, the delta rule is applied to adjust the weights.
• In the back-propagation algorithm, the output error starts from the output layer and moves backward until it reaches the hidden layer immediately next to the input layer. This process is called back-propagation, as it resembles an output error propagating backward.
• Even in back-propagation, the signal still flows through the connecting lines and the weights are
multiplied. The only difference is that the input and output signals flow in opposite directions.

Back-propagation algorithm

The output error starts from the output layer and moves backward until it reaches the hidden layer immediately next to the input layer.
Back-propagation algorithm
• Consider a neural network that consists of two nodes each for the input and output, and a hidden layer, which has two nodes.

Weighted sum of the hidden nodes:    v(1) = W(1)x
Output from the hidden nodes:        y(1) = φ(v(1))
Weighted sum of the output nodes:    v = W(2)y(1)
Output from the neural network:      y = φ(v)
Back-propagation algorithm
Train the neural network using the back-propagation algorithm

In the back-propagation algorithm, the delta of the output node is defined identically to the delta rule of the "Generalized Delta Rule":

  δi = φ'(vi)ei,  where ei = di − yi

φ'(.) is the derivative of the activation function of the output node,
yi is the output from the output node,
di is the correct output from the training data, and
vi is the weighted sum of the corresponding node.
Back-propagation algorithm
Train the neural network using the back-propagation algorithm.

In the back-propagation algorithm, the error of a node is defined as the weighted sum of the back-propagated deltas from the layer on its immediate right (in this case, the output layer). The error of a hidden node is calculated as this backward weighted sum of the deltas, and the delta of the node is the product of the error and the derivative of the activation function. This process begins at the output layer and repeats for all hidden layers.

Proceed leftward to the hidden nodes and calculate the delta:

  e1(1) = w11(2)δ1 + w21(2)δ2,   δ1(1) = φ'(v1(1))e1(1)
  e2(1) = w12(2)δ1 + w22(2)δ2,   δ2(1) = φ'(v2(1))e2(1)

v1(1) and v2(1) are the weighted sums of the forward signals at the respective nodes; the superscript (1) denotes the hidden layer and (2) the hidden-to-output weights.
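A sketch of this backward pass for the two-layer example, assuming numpy and the notation above (`W2` holds the hidden-to-output weights; the function name is illustrative):

```python
import numpy as np

def backprop_deltas(W2, v1, v, d, y, phi_prime):
    """Backward pass for the two-layer example.

    W2: (num_outputs, num_hidden) hidden-to-output weights
    v1, v: weighted sums at the hidden and output nodes
    d, y: correct and actual outputs
    """
    e = d - y                   # output error
    delta = phi_prime(v) * e    # delta of the output nodes
    e1 = W2.T @ delta           # hidden error: backward weighted sum of the deltas
    delta1 = phi_prime(v1) * e1 # delta of the hidden nodes
    return delta, delta1
```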
Back-propagation algorithm
Error calculation: ei = di − yi at the output nodes.

To adjust the weights of the respective layers, apply the delta rule:

  wij ← wij + αδixj

where xj is the input signal for the corresponding weight (for the hidden-to-output weights, this is the output of the hidden node).

Back-propagation algorithm
1. Initialize the weights with adequate values.
2. Enter the input from the training data { input, correct output } and obtain the neural network's output. Calculate the error of the output against the correct output and the delta, δ, of the output nodes.
3. Propagate the output node delta, δ, backward, and calculate the deltas of the immediate next (left) nodes.
4. Repeat Step 3 until it reaches the hidden layer that is on the immediate right of the input layer.
5. Adjust the weights according to the following learning rule:

   Δwij = αδixj,   wij ← wij + Δwij

6. Repeat Steps 2-5 for every training data point.
7. Repeat Steps 2-6 until the neural network is properly trained.
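Putting Steps 1-7 together, here is a minimal numpy sketch that trains a small network on XOR (the layer sizes, learning rate, and epoch count are illustrative choices, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR data: not linearly separable, so a hidden layer is required.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1 = rng.normal(size=(4, 2))   # Step 1: initialize the weights
W2 = rng.normal(size=(1, 4))
alpha = 0.9

for epoch in range(10000):          # Step 7: repeat until trained
    for x, d in zip(X, D):          # Step 6: every training point
        # Step 2: forward pass, then output error and delta
        v1 = W1 @ x
        y1 = sigmoid(v1)
        v = W2 @ y1
        y = sigmoid(v)
        delta = y * (1 - y) * (d - y)
        # Steps 3-4: propagate the delta backward to the hidden layer
        e1 = W2.T @ delta
        delta1 = y1 * (1 - y1) * e1
        # Step 5: adjust the weights with the delta rule
        W2 += alpha * np.outer(delta, y1)
        W1 += alpha * np.outer(delta1, x)

print(np.round(sigmoid(W2 @ sigmoid(W1 @ X.T)), 3))  # should be close to [[0. 1. 1. 0.]]
```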

