
Neural Networks

Dr. A. Ramesh
Associate Professor
Department of Management Studies
Agenda

• Introduction to Neural Networks


• Advantage of using Neural Networks
• Applications of Neural Networks
• Brief Recap of Linear and Logistic Regression
• Elements of a Neural Network – Layers, Weights, Activation functions
• Backpropagation and Gradient Descent
• Conclusion

Introduction

• Neural networks, also known as artificial neural networks (ANNs), are a subset of machine learning and are at the heart of deep learning algorithms.
• Their name and structure are inspired by the human brain, mimicking the way that biological neurons signal to one another.
• They create an adaptive system that computers use to learn from their mistakes and improve continuously.
Advantages of Neural Networks

• Neural networks can help computers make intelligent decisions with limited human assistance.
• They can learn and model relationships between input and output data that are nonlinear and complex.
• Neural networks can be used effectively for regression as well as classification problems.
Applications of Neural Networks
1. Computer Vision
• The ability to extract information and
insights from images and videos.
• With neural networks, computers can
distinguish and recognize images
Examples:
1. Self-driving cars to recognize road signs
2. Facial recognition
3. Image labeling to identify brand logos,
clothing, safety gear, etc.

Applications of Neural Networks

2. Speech Recognition
• Neural networks can analyze human speech despite varying
speech patterns, pitch, tone, language, and accent.
• Virtual assistants like Amazon Alexa rely on speech
recognition techniques.
Examples:
1. Assist call center agents and automatically classify calls
2. Convert clinical conversations into documentation in real
time
3. Accurately subtitle videos and meeting recordings for
wider content reach

Applications of Neural Networks

3. Predictive Maintenance
• Predictive maintenance is a growing application of
ANNs for improving equipment reliability and
reducing downtime.
• ANNs are used to analyze data from equipment
sensors to identify patterns and anomalies that
indicate when equipment is likely to fail.
• This helps companies to reduce maintenance costs.
Conclusion

In this lecture, we covered:


• Introduction to Neural Networks
• Advantages of Neural Networks
• Real world applications of Neural Networks

Recap of Linear & Logistic Regression

Dr. A. Ramesh
Associate Professor
Department of Management Studies
Agenda

• Brief recap of linear and logistic regression


• Error functions in linear & logistic regression
• Concept of Gradient Descent

Brief Recap

• Before studying Neural Networks, let us quickly recap Linear and Logistic
regression
• This will help us understand the workings of Neural Networks more easily.
Recap – Linear Regression

• Linear regression is one of the easiest and


most popular Machine Learning algorithms.
• It is a statistical method that is used for
predictive analysis.
• Linear regression makes predictions for
continuous/real or numeric variables such as
sales, salary, age, product price, etc.

Recap – Linear Regression

The linear regression equation is expressed as follows:

ŷ = w0 + w1x1 + … + wnxn

Here,
• ŷ = Predicted value for the dependent variable
• x1, x2, …, xn = Independent variables
• w0 = Intercept of the line
• w1, w2, …, wn = Regression coefficients
Recap – Linear Regression

• Our aim is to determine w0, w1, …, wn such that the Mean Squared Error is minimized.
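As a small illustration of this objective, here is a minimal Python sketch that computes the Mean Squared Error for a candidate set of weights. The data values here are made up, not from the lecture:

```python
import numpy as np

# Hypothetical data: one independent variable x and a target y.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.2, 5.9, 8.1])

w0, w1 = 0.0, 2.0            # candidate intercept and regression coefficient
y_pred = w0 + w1 * x         # linear regression prediction: ŷ = w0 + w1*x

mse = np.mean((y_pred - y) ** 2)   # Mean Squared Error
print(mse)
```

Gradient descent, discussed shortly, searches for the weights that make this quantity as small as possible.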

Recap – Logistic Regression

• It is often used for classification and predictive analytics.


• Logistic regression estimates the probability of an event occurring, such as voted or didn't vote, based on a given dataset of independent variables.
• Since the outcome is a probability, the dependent variable is bounded between 0 and 1.
• In logistic regression, the dependent variable is categorical in nature.
Recap – Logistic Regression

• We commonly use the Sigmoid function in logistic regression


• It is used to map the predicted values to probabilities.
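A minimal Python sketch of the sigmoid function (the function itself is the standard definition; the sample inputs are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))   # 0.5
print(sigmoid(3.5))   # ~0.97, the value used in the worked example later
```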

Recap – Logistic Regression

• The equation for logistic regression is as follows:

P(y = 1) = 1 / (1 + e^-(w0 + w1x1 + … + wnxn))

• The error function used for logistic regression is known as the log-loss function.
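To make the log-loss concrete, here is a short Python sketch of the standard binary cross-entropy; the labels and predicted probabilities are made up for illustration:

```python
import numpy as np

def log_loss(y_true, p_pred, eps=1e-12):
    """Average log-loss (binary cross-entropy) over predicted probabilities."""
    p = np.clip(p_pred, eps, 1 - eps)   # guard against log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1, 0, 1, 1])          # actual class labels
p_pred = np.array([0.9, 0.2, 0.7, 0.6])  # predicted probabilities
print(log_loss(y_true, p_pred))
```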
Gradient Descent

• Gradient descent is commonly used for determining the weights (coefficients) of regression models.
• The error functions discussed earlier are convex (bowl-shaped) in nature.
• Thus, we can numerically determine the weights that minimize the error function.
Gradient Descent

• In gradient descent, we start at a random point.
• From the chosen random point, we move in the direction of the downward slope.
• We continuously update the weights until we reach the minimum point.
• The step size that we take to move downwards is called the learning rate (denoted by α). A minimal sketch of this procedure follows below.
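The update rule can be sketched in a few lines of Python. This is an illustrative implementation of gradient descent for simple linear regression with made-up data, not code from the lecture:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical input data
y = np.array([2.0, 4.0, 6.0, 8.0])   # hypothetical targets (y = 2x)

w0, w1 = 0.0, 0.0    # start from an arbitrary point
alpha = 0.05         # learning rate (step size)

for _ in range(1000):
    y_pred = w0 + w1 * x
    grad_w0 = 2 * np.mean(y_pred - y)          # dMSE/dw0
    grad_w1 = 2 * np.mean((y_pred - y) * x)    # dMSE/dw1
    w0 -= alpha * grad_w0   # step in the direction of the downward slope
    w1 -= alpha * grad_w1

print(w0, w1)   # converges towards (0, 2)
```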

Conclusion

In this lecture, we covered


• A recap of linear and logistic regression techniques
• Error functions
• Intuition behind Gradient Descent

In the next lecture, we will continue our discussion on Neural Networks

Elements of Neural Network

Dr. A. Ramesh
Associate Professor
Department of Management Studies
Agenda

An introduction to terminologies for Neural Network:


1. Layers
2. Neurons
3. Weights and Bias
4. Activation Functions

Elements of Neural Networks - Layers

• A Neural Network consists of one input layer, one output layer and one
or more hidden layers.

Elements of Neural Networks - Layers

• Each layer is made up of several nodes or neurons.


• The number of neurons in the input layer is equal to the number of input
variables
• Similarly, the number of neurons in the output layer is equal to the
number of output variables.
• Generally, a Neural Network comprising more than 3 hidden layers is called a Deep Neural Network.
Elements of Neural Networks - Layers

• The data scientist defines the number of hidden layers, and the neurons present in them, based on expertise and model performance.
• It is recommended to try different numbers of hidden layers and select the model with the highest performance.
• You should note that using a very large number of hidden layers may lead to overfitting, which should be avoided.
What happens inside a Neural Network?

• Consider the following Neural Network:
 We have 3 inputs (X1, X2 and X3) for our input layer
 2 hidden layers
 1 output variable

(Figure: inputs X1, X2, X3 feeding two hidden layers and a single output.)
What happens inside a Neural Network?

• X1, X2 and X3 are provided as input to


every neuron of the 1st hidden layer
• The output of 1st hidden layer is input
for the 2nd hidden layer
• The output of the 2nd hidden layer is
provided as input for the Output layer
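This layer-to-layer flow can be sketched as a forward pass in Python. The weight values below are random placeholders, purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0])                  # inputs X1, X2, X3

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden layer 1: 4 neurons
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)  # hidden layer 2: 4 neurons
W3, b3 = rng.normal(size=(1, 4)), np.zeros(1)  # output layer: 1 neuron

h1 = sigmoid(W1 @ x + b1)   # output of hidden layer 1 is input to layer 2
h2 = sigmoid(W2 @ h1 + b2)  # output of hidden layer 2 is input to output layer
y_pred = W3 @ h2 + b3       # linear output, e.g. for a regression problem
print(y_pred)
```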

Hidden Layer Neurons

• Every hidden layer neuron is associated with a set of weights and a bias
• The output layer also has its own set of weights and bias
(Figure: the four neurons of hidden layer 1 carry weights and biases w11, w12, w13, b1, …, w41, w42, w43, b4; the neurons of hidden layer 2 carry w'11, …, w'43 and b'1, …, b'4; the output neuron carries weights W''11, W''12, W''13, … and bias B.)
Hidden Layer Neurons
• For every iteration, a hidden neuron performs two tasks:
1. Weighted summation of the inputs: z = w11x1 + w12x2 + … + w1nxn + b1
2. Applying the activation function (σ) to the weighted sum z.

This value σ(z) becomes the input for the next hidden layer's neurons; a sketch of this computation follows below.

(Figure: inputs X1, X2, X3 enter a hidden layer neuron, which computes z = wx + b and passes σ(z) on to the next layer.)
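Here is a minimal Python sketch of those two tasks for a single hidden neuron; the weight and bias values are invented for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([5.0, 3.0, 1.0])      # inputs x1, x2, x3
w = np.array([0.1, 0.4, -0.2])     # weights w11, w12, w13 (hypothetical)
b = 1.0                            # bias b1 (hypothetical)

z = np.dot(w, x) + b               # task 1: weighted summation
a = sigmoid(z)                     # task 2: apply the activation function
print(z, a)                        # σ(z) feeds the next layer's neurons
```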
Hidden Layer Neurons

• This process is repeated throughout the training process.
• It can be seen that the process is very similar to the conventional regression methods discussed earlier.

(Figure: the same hidden-neuron computation as above, Z = wx + b followed by σ(z).)
Conclusion

• In this lecture, we studied the following:


• Input layers, hidden layers, output layers
• Weights and bias for neurons
• Calculations taking place inside a hidden neuron

Activation functions

Dr. A. Ramesh
Associate Professor
Department of Management Studies
Agenda

• What are activation functions?


• Different types of activation functions
• Error function for Neural Network

Activation functions

• An activation function is used to transform the input non-linearly.
• It allows the Neural Network to learn complex relationships between variables, which is not possible through conventional techniques.
• The non-linearity of a Neural Network results from the use of activation functions.
Activation functions

• Commonly used Activation functions are:


1. Sigmoid function
2. Rectified Linear Unit
3. Tanh function

Activation functions

Sigmoid function
• σ(z) = 1 / (1 + e^-z)
• Range: (0, 1)
Activation functions

Rectified Linear Unit (also called ReLU)
• ReLU(z) = max(0, z)
• Range: [0, ∞)
• Usually preferred over other activation functions
Activation functions

Tanh function
• tanh(z) = (e^z − e^-z) / (e^z + e^-z)
• Range: (−1, 1)
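A compact Python sketch of all three activation functions (standard definitions; the inputs are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))    # range (0, 1)

def relu(z):
    return np.maximum(0.0, z)          # range [0, inf)

def tanh(z):
    return np.tanh(z)                  # range (-1, 1)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # [0.119 0.5   0.881]
print(relu(z))      # [0. 0. 2.]
print(tanh(z))      # [-0.964  0.     0.964]
```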
Activation functions

• A neural network will almost always have the same activation function in all hidden layers.
• The activation function for the output layer is usually different from the activation function for the hidden layers.
Activation functions for Output Layer

• The commonly used activation functions for output layer are:


1. Linear activation
2. Sigmoid function
3. Softmax function

Activation function for Output Layer

• Linear activation for the output layer is used when you are working with a regression problem.
• The linear activation function is represented by y = x.
• The outputs of the last hidden layer are used directly in a weighted sum to determine the output of the neural network:

Ypred = W1·σ1(z) + W2·σ2(z) + W3·σ3(z) + W4·σ4(z) + B

(Figure: hidden outputs σ1(z), …, σ4(z) feed the output neuron, with weights W''11, W''12, W''13, … and bias B, to produce Ypred.)
Activation functions for Output Layer

• Sigmoid activation for the output layer is used mainly for binary classification problems.
• Due to the sigmoid activation function, the output of the Neural Network, i.e. Ypred, will be between 0 and 1:

Ypred = Sigmoid(W1·σ1(z) + W2·σ2(z) + W3·σ3(z) + W4·σ4(z) + B)

(Figure: the same network as above, with sigmoid activation applied at the output neuron.)
Activation functions for Output Layer

• Similarly, for multi-class classification problems, we use the Softmax function.
• The softmax function outputs a vector of values that sum to 1.0, which can be interpreted as probabilities of class membership:

Ypred = Softmax(W1·σ1(z) + W2·σ2(z) + W3·σ3(z) + W4·σ4(z) + B)
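A minimal Python sketch of the softmax function, in its standard numerically stable form; the scores are made up for illustration:

```python
import numpy as np

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1.0."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # hypothetical output-layer scores
probs = softmax(scores)
print(probs)         # ~[0.659 0.242 0.099]
print(probs.sum())   # 1.0
```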
Error function for Neural Networks

• The error function for Neural Network also depends on the type of
problem.
• As discussed earlier, the Mean Squared Error is used for regression
problems
• The Log-loss function is used for Classification problems
• The aim is to determine the weights and bias values for all hidden layers
and the output layer such that our Error function is minimized.

Gradient Descent

• Initially, all weights and bias values are chosen randomly as discussed
earlier
• We use gradient descent to determine the optimal values for weights
and biases of the Neural Network that minimize the error function.

Conclusion

In this lecture, we covered


• Brief discussion on activation functions
• Different activation functions for hidden and output layers
• Error function for Neural Networks

Numerical Example for Neural Network

Dr. A. Ramesh
Associate Professor
Department of Management Studies
Agenda

• Solve a simple example of a Neural Network
• Understand the calculations and working of a Neural Network in detail
Example for Neural Networks

• Let us consider a simple example to understand the calculations that


take place in a Neural Network.
• We consider two input variables X1 and X2.
• Y is the output variable, the variable that we wish to predict.
• The data is shown in the table given below:

X1 X2 Y

5 5 10

Example for Neural Networks

• We consider the following Neural Network for this example.
• We have two hidden layers, each containing one neuron.
• The activation function for the hidden layers is the Sigmoid function.
• The activation function for the output layer is the Linear function.

(Figure: X1, X2 → hidden neuron 1 (sigmoid activation) → hidden neuron 2 (sigmoid activation) → output neuron (linear activation).)
Calculation for Hidden Layer - 1
Input data: X1 = 5, X2 = 5, Y = 10

• Let the weights and bias for the Hidden Layer 1 neuron be:
w1 = 0, w2 = 0.5, b = 1

Thus, the weighted summation will be:
Z1 = w1x1 + w2x2 + b = 0 × 5 + 0.5 × 5 + 1 = 3.5

The output from this hidden layer will be Sigmoid(Z1):
Sigmoid(Z1) = 1 / (1 + e^-Z1) = 1 / (1 + e^-3.5) = 1 / 1.0302
Y1 = Sigmoid(Z1) ≈ 0.97
Calculation for Hidden Layer - 2
• Let the weights and bias for the Hidden Layer 2 neuron be:
w1 = 1, b = 2

We do not have w2 in this case, since there is only one input (Y1) for Hidden Layer 2.

Thus, the weighted summation will be:
Z2 = w1Y1 + b = 1 × 0.97 + 2 = 2.97

The output from the 2nd hidden layer will be Sigmoid(Z2):
Sigmoid(Z2) = 1 / (1 + e^-Z2) = 1 / (1 + e^-2.97) = 1 / 1.0513
Y2 = Sigmoid(Z2) ≈ 0.951
Calculation for Output Layer
• Let the weights and bias for the Output Layer neuron be:
W1 = 10, B = 10

Thus, the weighted summation will be:
Z3 = W1Y2 + B = 10 × 0.951 + 10 = 19.51
Example for Neural Networks

(Figure: X1 = 5, X2 = 5 → Y1 ≈ 0.97 → Y2 ≈ 0.951 → Z3 = 19.51.)

• The Z value for the output layer is passed through the Linear activation function (y = x).
• Thus, the output of the output layer, i.e. Ypred = Z3 = 19.51.
Example for Neural Networks

• Now that we have our Ypred value, the error can be calculated.
• Since this is a regression problem, the error function will be the Mean Squared Error.
• Ypred = 19.51; Yactual = 10 (from the data: X1 = 5, X2 = 5, Y = 10)
• The MSE will be:

MSE = (19.51 − 10)² = (9.51)² ≈ 90.44

We perform gradient descent to update our weights until the MSE is minimized.
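The whole worked example can be verified with a short Python sketch of the forward pass. The weights are the ones assumed on the previous slides; small differences from the slide numbers are due to rounding of intermediate values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x1, x2, y_actual = 5.0, 5.0, 10.0

# Hidden layer 1: w1 = 0, w2 = 0.5, b = 1
z1 = 0.0 * x1 + 0.5 * x2 + 1.0
y1 = sigmoid(z1)                 # ~0.97

# Hidden layer 2: w1 = 1, b = 2
z2 = 1.0 * y1 + 2.0
y2 = sigmoid(z2)                 # ~0.951

# Output layer (linear activation): W1 = 10, B = 10
y_pred = 10.0 * y2 + 10.0        # ~19.51

squared_error = (y_pred - y_actual) ** 2   # ~90.5
print(y1, y2, y_pred, squared_error)
```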
Conclusion

• In this lecture, we provided an introduction to Neural Networks and their applications.
• We also gave a brief recap of linear and logistic regression techniques.
• We studied the various elements of a Neural Network.
• We also examined the calculations that take place within a Neural Network and the different types of activation functions.
• Finally, we discussed these steps with the help of an example.
• In the next lecture, we shall understand how to implement a Neural Network using Python.