Lesson 3: Artificial Neural Network
TensorFlow
Learning Objectives
[Figure: a biological neuron, with input signals arriving at the dendrites, a cell nucleus, and an axon wrapped in a myelin sheath carrying output signals]
▪ Neurons are interconnected nerve cells that build the nervous system and transmit information throughout the body.
▪ Dendrites are extensions of a nerve cell that receive impulses from other neurons.
▪ The cell nucleus stores the cell's hereditary material and coordinates the cell's activities.
▪ The axon is a nerve fiber used by neurons to transmit impulses.
▪ A synapse is the connection between two nerve cells.
Rise of Artificial Neurons
[Figure: a biological neuron side by side with its artificial counterpart]
▪ Researchers Warren McCulloch and Walter Pitts published the first concept of a simplified brain cell in 1943.
▪ The nerve cell was modeled as a simple logic gate with binary outputs.
▪ The dendrites can be thought of as processing the input signal against a certain threshold: if the combined signal exceeds the threshold, an output signal is generated.
Definition of Artificial Neuron
▪ Dendrites → Inputs
▪ Axon → Output
Neural Networks
Perceptron
[Figure: a perceptron. Inputs x1, x2, …, xm, together with a constant input 1, are weighted by w1, w2, …, wm and a bias weight w0, summed, and passed through a threshold to produce the output]
Perceptron: The Main Processing Unit
[Figure: the inputs are combined by the summation function S and passed through the activation function f(S) to produce the output]
While the weights determine the slope of the decision boundary, the bias shifts the output line to the left or right.
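To make the summation function, activation function, and bias concrete, here is a minimal sketch of a perceptron forward pass in Python. The step activation and the example numbers are illustrative choices, not part of the slides:

```python
import numpy as np

def step(s):
    """Step activation: fires 1 if the summed signal is non-negative."""
    return 1 if s >= 0 else 0

def perceptron(x, w, b):
    """Summation function followed by the activation function."""
    s = np.dot(w, x) + b      # weighted sum plus bias
    return step(s)            # activation decides the output

# The weights set the slope of the decision boundary;
# the bias shifts it left or right.
x = np.array([0.5, -1.0])
w = np.array([2.0, 1.0])
print(perceptron(x, w, b=0.0))   # 1  (0.5*2 - 1.0 = 0 meets the threshold)
print(perceptron(x, w, b=-0.5))  # 0  (the bias shifts the sum below it)
```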
A perceptron can learn anything that it can represent, i.e., anything separable with a hyperplane. However, it cannot represent exclusive OR (XOR), since XOR is not linearly separable.
[Figure: the XOR inputs plotted in the x1-x2 plane with axis values -1 and 1; no single straight line separates the two output classes]
Multilayer Perceptrons
[Figure: the sigmoid Ψ(a) = 1 / (1 + e^(-a)), the most common output function, rising smoothly from 0 toward 1 as a increases]
Problem Scenario: You are hired by one of the major AI giants, which plans to build the best image classifier model available to date. In the first phase of model development, the input comes from the MNIST dataset. The MNIST dataset is one of the most common datasets used for image classification and is accessible from many different sources. It is a subset of a larger set available from NIST and contains 60,000 training images and 10,000 testing images of handwritten digits written by American Census Bureau employees and American high school students.
Objective:
Build a perceptron-based classification model to:
▪ Classify the handwritten digits properly
▪ Make predictions
▪ Evaluate model efficiency
Access: Click the Practice Labs tab on the left panel. Click the START LAB button and wait while the lab prepares itself. Then, click the LAUNCH LAB button. A full-fledged JupyterLab opens, which you can use for your hands-on practice and projects.
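The lab's solution is not reproduced here, but a minimal Keras sketch of a perceptron-style MNIST classifier might look like the following. It assumes TensorFlow is available in the lab environment; the single Dense layer, the optimizer, and the epoch count are illustrative choices:

```python
import tensorflow as tf

# Load MNIST: 60,000 training and 10,000 testing images of handwritten digits.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

# A single dense layer over the flattened pixels: a perceptron-style classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)   # classify the handwritten digits
predictions = model.predict(x_test)     # make predictions
model.evaluate(x_test, y_test)          # evaluate model efficiency
```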
Backpropagation
Learning Networks
1. Let 𝑊𝑖 be the initial weights.
2. Let X be the input and Y the output.
3. Update the weights: 𝑊𝑖(t+1) = 𝑊𝑖(t) + η(d − y)X, where d is the desired output, y is the actual output, and η is the learning rate.
4. The former steps are iterated continuously, changing the value of η, until a satisfactory output is obtained.
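A minimal sketch of this learning rule in Python, assuming a step activation and a fixed learning rate η. The logical-AND dataset (with a column of 1s acting as the bias input) is an illustrative choice:

```python
import numpy as np

def train_perceptron(X, d, eta=0.1, epochs=20):
    """W(t+1) = W(t) + eta * (d - y) * X, iterated over the data."""
    w = np.zeros(X.shape[1])          # initial weights W_i
    for _ in range(epochs):
        for x_i, d_i in zip(X, d):
            y = 1 if np.dot(w, x_i) >= 0 else 0   # actual output y
            w += eta * (d_i - y) * x_i            # update rule
    return w

# Learn logical AND, a linearly separable function; the first
# column of 1s provides the bias weight.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
d = np.array([0, 0, 0, 1])
print(train_perceptron(X, d))
```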
The Error Landscape
[Figure: the error landscape, plotting the error against the weight values]
SSE (sum squared error): SSE = Σᵢ (tᵢ − zᵢ)², where tᵢ is the target and zᵢ is the actual output.
Deriving a Gradient Descent or Ascent Algorithm
The idea of the algorithm is to decrease the overall error (or another objective function) each time a weight is changed. Take a step in the direction of the gradient, with a step size proportional to the magnitude of the gradient.
Gradient Ascent: Steps 01-06
[Figure sequence: starting from an initial point, each step evaluates the gradient of the function value at the current x and moves x in that direction; repeating the step climbs toward a maximum]
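A minimal sketch of gradient ascent in Python. The objective f(x) = -(x - 3)², its gradient, the starting point, and the step size are all illustrative choices:

```python
def gradient_ascent(grad, x, step_size=0.1, steps=6):
    """Repeatedly step in the direction of the gradient; each step
    is proportional to the gradient at the current point."""
    for i in range(steps):
        x = x + step_size * grad(x)
        print(f"step {i + 1}: x = {x:.4f}")
    return x

# Maximize f(x) = -(x - 3)^2, whose gradient is -2 * (x - 3);
# x climbs from 0 toward the maximum at x = 3.
gradient_ascent(lambda x: -2 * (x - 3), x=0.0)
```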
The Learning Rate
▪ The learning rate controls the size of the changes made to the weights and biases in order to reduce the error.
▪ It is used to analyze how much the error will change when the values of the weights and biases are changed by a unit.
[Figure: a 2-2-2 network. Inputs i1 = 0.05 and i2 = 0.10 feed hidden neurons h1 and h2 through weights w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30 with bias b1 = 0.35; the hidden layer feeds outputs o1 and o2 through weights w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55 with bias b2 = 0.60; the target outputs are 0.01 and 0.99]
Forward Pass
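Assuming sigmoid activations and the ½-scaled squared error used in the classic version of this worked example, the forward pass for the network above can be computed as follows:

```python
import numpy as np

def sigmoid(s):
    return 1 / (1 + np.exp(-s))

i1, i2, b1, b2 = 0.05, 0.10, 0.35, 0.60
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55

# Hidden layer: summation, then activation.
net_h1 = w1 * i1 + w2 * i2 + b1                      # 0.3775
net_h2 = w3 * i1 + w4 * i2 + b1                      # 0.3925
out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)    # ≈ 0.5933, 0.5969

# Output layer: same pattern.
net_o1 = w5 * out_h1 + w6 * out_h2 + b2
net_o2 = w7 * out_h1 + w8 * out_h2 + b2
out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)    # ≈ 0.7514, 0.7729

# Total error against the targets 0.01 and 0.99.
E = 0.5 * (0.01 - out_o1) ** 2 + 0.5 * (0.99 - out_o2) ** 2   # ≈ 0.2984
```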
Update the weight: to decrease the error, we subtract this value (the gradient scaled by the learning rate) from the current weight.
Weight Update
Note: While backpropagating the error to the hidden layer, we use the original weights, not the updated weights.
Hidden Layer Weight Assignment
[Figure: the error from both output neurons is propagated back through the net input and output of hidden neuron h1 to update w1; the same procedure applies to the remaining hidden-layer weights]
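Continuing the forward-pass sketch above, one output weight (w5) and one hidden weight (w1) can be updated by the chain rule. Note how the hidden-layer gradient uses the original w5 and w7, as stated in the note above; the learning rate η = 0.5 follows the classic worked example:

```python
# Values carried over from the forward pass above.
i1, w1, w5, w7, eta = 0.05, 0.15, 0.40, 0.50, 0.5
out_h1 = 0.593269                       # hidden output
out_o1, out_o2 = 0.751365, 0.772928     # network outputs

# Output-layer weight w5:
# dE/dw5 = dE/dout_o1 * dout_o1/dnet_o1 * dnet_o1/dw5
delta_o1 = (out_o1 - 0.01) * out_o1 * (1 - out_o1)   # ≈ 0.138499
w5_new = w5 - eta * delta_o1 * out_h1                # ≈ 0.358916

# Hidden-layer weight w1: the error from BOTH outputs flows back
# through h1, using the ORIGINAL w5 and w7, not the updated ones.
delta_o2 = (out_o2 - 0.99) * out_o2 * (1 - out_o2)   # ≈ -0.038098
d_out_h1 = delta_o1 * w5 + delta_o2 * w7             # ≈ 0.036350
dE_dw1 = d_out_h1 * out_h1 * (1 - out_h1) * i1       # ≈ 0.000439
w1_new = w1 - eta * dE_dw1                           # ≈ 0.149781
print(w5_new, w1_new)
```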
Nonlinearities are needed to learn complex (non-linear) representations of the data; otherwise, the neural network would just be a linear function.
Activation Functions
Sigmoid:
▪ Takes a real-valued number and squashes it into the range 0 to 1
▪ Sigmoid neurons saturate and kill gradients, so the neural network will barely learn
Tanh:
▪ Takes a real-valued number and squashes it into the range -1 to 1
▪ Like sigmoid, tanh neurons saturate
▪ Unlike sigmoid, the output is zero-centered
ReLU:
▪ Computes f(x) = max(0, x): negative inputs are clipped to zero
▪ Does not saturate for positive inputs, so gradients flow well there
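All three activation functions are one-liners in NumPy; the sample inputs below are an illustrative choice:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))      # squashes into (0, 1)

def tanh(x):
    return np.tanh(x)                # squashes into (-1, 1), zero-centered

def relu(x):
    return np.maximum(0, x)          # 0 for negatives, identity otherwise

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))   # [0.119 0.5   0.881]
print(tanh(x))      # [-0.964  0.     0.964]
print(relu(x))      # [0. 0. 2.]
```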
Problem Scenario: The backpropagation algorithm plays a key role in training a feedforward artificial neural network. It models a given function by modifying the internal weights of the network so as to produce the expected output.
Objective:
Build a neural network that uses tanh as the activation function and updates the weights with respect to the tanh gradients.
Access: Click the Practice Labs tab on the left panel. Click the START LAB button and wait while the lab prepares itself. Then, click the LAUNCH LAB button. A full-fledged JupyterLab opens, which you can use for your hands-on practice and projects.
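As a hint for this lab, the key identity is that the tanh gradient can be expressed through the tanh output itself: d/dx tanh(x) = 1 − tanh²(x). A minimal single-weight update using this identity; all the numbers below are illustrative:

```python
import numpy as np

def tanh_derivative(out):
    """Derivative of tanh expressed through its own output:
    d/dx tanh(x) = 1 - tanh(x)^2."""
    return 1 - out ** 2

# One weight update driven by the tanh gradient (illustrative values).
x, target, w, eta = 0.5, 0.8, 0.3, 0.1
out = np.tanh(w * x)
grad = (out - target) * tanh_derivative(out) * x   # chain rule
w -= eta * grad
```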
Activation Function
Problem Scenario: Neural networks are the crux of deep learning, a field that has practical applications in many different areas. They become more accurate and effective with multiple layers. Building a multilayered neural network from scratch is not feasible here; however, developing the source code for a shallow neural network will help you understand the functioning of deep neural networks much better.
Objective: Write a simple neural network in Python using sigmoid as the activation function.
Access: Click the Practice Labs tab on the left panel. Click the START LAB button and wait while the lab prepares itself. Then, click the LAUNCH LAB button. A full-fledged JupyterLab opens, which you can use for your hands-on practice and projects.
Defining Elements
Import the necessary libraries and define a class named NeuralNetwork
Initialize Weights and Assign Activation Functions
Weight Adjustment
Note: The above functions are defined within the NeuralNetwork class.
Initialize the think Function
Initialize the Neural Network
Train the Neural Network
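The lab's exact source code is not reproduced here, but a minimal sketch that follows the steps above (a single-layer network with sigmoid activation, random weight initialization, a think method for the forward pass, and error-scaled weight adjustment) might look like this; the 3-input toy dataset is an illustrative choice:

```python
import numpy as np

class NeuralNetwork:
    def __init__(self):
        # Initialize weights: random values in (-1, 1) for 3 inputs.
        np.random.seed(1)
        self.weights = 2 * np.random.random((3, 1)) - 1

    def __sigmoid(self, x):
        # Activation function: squashes the input into the range 0 to 1.
        return 1 / (1 + np.exp(-x))

    def __sigmoid_derivative(self, x):
        # Gradient of the sigmoid, used for weight adjustment.
        return x * (1 - x)

    def think(self, inputs):
        # Forward pass: weighted sum followed by the activation.
        return self.__sigmoid(np.dot(inputs, self.weights))

    def train(self, inputs, outputs, iterations):
        for _ in range(iterations):
            predicted = self.think(inputs)
            error = outputs - predicted
            # Weight adjustment: error scaled by the sigmoid gradient.
            adjustment = np.dot(inputs.T,
                                error * self.__sigmoid_derivative(predicted))
            self.weights += adjustment

# Initialize and train the neural network, then query it.
nn = NeuralNetwork()
X = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
nn.train(X, y, 10000)
print(nn.think(np.array([1, 0, 0])))   # ≈ 1
```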
Regularization
Deep Neural Networks
When a neural network contains more than one hidden layer, it becomes a deep neural network.
[Figure: training error and validation error plotted against epochs; the validation error begins to rise where the model starts to overfit, which is where the early stopping algorithm halts training]
A learned hypothesis may fit the training data, including the outliers (noise), very well but fail to generalize to test data.
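In Keras, the early stopping behavior sketched in the figure can be expressed with the EarlyStopping callback; the monitored metric and patience below are illustrative choices, and the model and training arrays are assumed to exist:

```python
import tensorflow as tf

# Stop training once the validation error stops improving,
# before the network starts to overfit.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',          # watch the validation error
    patience=3,                  # tolerate 3 stagnant epochs
    restore_best_weights=True)   # roll back to the best epoch

# Hypothetical model and data; fit with a validation split so the
# callback can compare the training and validation curves.
# model.fit(x_train, y_train, validation_split=0.2,
#           epochs=100, callbacks=[early_stop])
```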
Dealing with the Overfitting Problem
There are two common techniques:
▪ Dropout regularization
▪ L2 regularization
Dropout Layers
Dropout Experiment
An architecture of 784-2048-2048-2048-10 is used on the MNIST dataset. The retention probability 𝑝 was varied from small values (most units dropped out) to 1.0 (no dropout).
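A sketch of the experiment's architecture in Keras. Note that Keras's Dropout layer takes the drop rate, i.e., 1 − p for a retention probability p; the value p = 0.5 and the ReLU activations are illustrative choices:

```python
import tensorflow as tf

p = 0.5   # one retention probability from the experiment's sweep
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),      # 784 inputs
    tf.keras.layers.Dense(2048, activation='relu'),
    tf.keras.layers.Dropout(1 - p),   # Keras expects the DROP rate, 1 - p
    tf.keras.layers.Dense(2048, activation='relu'),
    tf.keras.layers.Dropout(1 - p),
    tf.keras.layers.Dense(2048, activation='relu'),
    tf.keras.layers.Dropout(1 - p),
    tf.keras.layers.Dense(10, activation='softmax'),    # 10 digit classes
])
```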
Knowledge Check 1
After the perceptron algorithm finishes training, how can the learned weights be expressed in terms of the initial weight vector and the input vectors?

Answer: During the perceptron training algorithm, the weights are updated by adding or subtracting the input vector, and this might happen multiple times for the same input vector. Therefore, the weight vector = initial vector + c1 · data point 1 + c2 · data point 2 + … + cn · data point n, where ci = (number of times data point i was added) − (number of times data point i was subtracted).
Knowledge Check 2
Which of the following techniques performs operations similar to dropout in a neural network?
a. Bagging
b. Boosting
c. Stacking
d. None of these

Answer: a. Bagging. Dropout can be seen as an extreme form of bagging in which each model is trained on a single case and each parameter of the model is very strongly regularized by sharing it with the corresponding parameter in all the other models.
MNIST Image Classification
Problem Scenario: The MNIST dataset is widely used for image classification. However, while validating a classifier trained on it, researchers found that the model was overfitting, as it did not give acceptable accuracy on the testing data.
Use mnist_test.csv and mnist_train.csv for model optimization (using dropout layers). You will also have to use one-hot encoding for the training and testing labels.
Objective:
Optimize a neural network based classification model using dropout regularization such that the p value is 0.70 for the input and hidden layers.
Access: Click the Practice Labs tab on the left panel. Click the START LAB button and wait while the lab prepares itself. Then, click the LAUNCH LAB button. A full-fledged JupyterLab opens, which you can use for your hands-on practice and projects.
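A minimal sketch of one possible solution. It assumes the usual CSV layout (first column is the label, the remaining 784 columns are pixels) and interprets p = 0.70 as the retention probability, i.e., a Keras drop rate of 0.30; use Dropout(0.70) instead if the lab means the drop rate. The hidden-layer width and epoch count are illustrative choices:

```python
import pandas as pd
import tensorflow as tf

# Assumed CSV layout: first column is the label, the rest are pixels.
train = pd.read_csv('mnist_train.csv')
test = pd.read_csv('mnist_test.csv')
x_train = train.iloc[:, 1:].values / 255.0
x_test = test.iloc[:, 1:].values / 255.0

# One-hot encode the training and testing labels.
y_train = tf.keras.utils.to_categorical(train.iloc[:, 0], 10)
y_test = tf.keras.utils.to_categorical(test.iloc[:, 0], 10)

# Retention probability p = 0.70 for the input and hidden layers,
# i.e., a drop rate of 0.30 in Keras.
model = tf.keras.Sequential([
    tf.keras.layers.Dropout(0.30, input_shape=(784,)),   # input layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.30),                       # hidden layer
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```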
Thank You