Lecture 15

The document discusses activation functions and backpropagation algorithms in artificial neural networks. It defines key concepts like activation functions, types of activation functions including sigmoid, tanh, ReLU, and describes how non-linear activation functions allow neural networks to learn complex patterns. It then explains the backpropagation algorithm for training multi-layer perceptrons, including initializing weights, propagating inputs forward and errors backward to update weights, and conditions for terminating training.


eMBA933

Data Mining
Tools & Techniques
Lecture 15.1

Dr. Faiz Hamid


Associate Professor
Department of IME
IIT Kanpur
[email protected]
Artificial Neural Networks
Role of Activation Function
• Acts as a mathematical “gate” between the input feeding a neuron and its output going to the next layer
• With linear activation functions, no matter how many layers the neural network has, the last layer is a linear function of the first layer (see the sketch below)
  – A linear combination of linear functions is still a linear function
  – A linear activation function collapses the neural network into a single layer
  – A neural network with only linear activation functions is simply a linear regression model
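A minimal NumPy sketch (layer sizes and random weights are illustrative) showing that two stacked layers with linear activations collapse into a single linear map:

import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with identity (linear) activation: y = W2 @ (W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)
two_layer_out = W2 @ (W1 @ x + b1) + b2

# The same mapping expressed as ONE linear layer: W = W2 @ W1, b = W2 @ b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
one_layer_out = W @ x + b

print(np.allclose(two_layer_out, one_layer_out))  # True: the extra depth added nothing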
Role of Activation Function
• Non‐linear activation functions allow complex mappings between the network’s inputs and outputs
  – essential for learning and modeling complex data, such as images, video, audio, and data sets that are non‐linear or high‐dimensional
• Activation functions should be differentiable
  – the derivative (gradient) of the activation function is used for error calculations during the backpropagation algorithm to improve and optimize the results (a quick check is sketched below)
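A small sketch (the function choices are illustrative) that checks the analytic derivatives used during backpropagation against a finite-difference approximation:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

# Backpropagation relies on these gradients; verify them numerically
x = np.linspace(-3, 3, 7)
eps = 1e-6
fd_sigmoid = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
fd_tanh = (np.tanh(x + eps) - np.tanh(x - eps)) / (2 * eps)

print(np.allclose(fd_sigmoid, sigmoid_grad(x)))  # True
print(np.allclose(fd_tanh, tanh_grad(x)))        # True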
Types of Activation Functions
• Linear Function
  – Takes the inputs, multiplies them by the weights for each neuron, and produces an output signal proportional to the input
  – Better than a step function: allows multiple output values, not just yes or no
  – Not possible to use backpropagation (gradient descent) to train the model
  – The derivative of the function is a constant and has no relation to the input
Types of Activation Functions
• Nonlinear Activation Functions
  – Allow backpropagation because their derivative is a function of the input
  – Allow “stacking” of multiple layers of neurons to create a deep neural network that can learn complex data sets with high accuracy
• Sigmoid / Logistic (see the sketch below)
  – Smooth gradient, preventing “jumps” in output values
  – Clear predictions: for x above 2 or below ‐2, the output is pushed toward the edge of the curve, very close to 1 or 0
  – Output values between 0 and 1, normalizing the output of each neuron
  – Usually used in the output layer for binary classification
  – Vanishing gradient: for very high or very low values of x there is almost no change in the output, so the network effectively stops learning
  – Computationally expensive
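A minimal sketch of the sigmoid and its gradient (the sample points are illustrative), showing how the gradient vanishes for large |x|:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # at most 0.25 (at x = 0), near zero for large |x|

for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.4f}  gradient={sigmoid_grad(x):.6f}")
# at |x| = 10 the gradient is ~0.000045: almost no learning signal (vanishing gradient)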
Types of Activation Functions
• TanH / Hyperbolic Tangent
  – Scaled version of the sigmoid function, with outputs between ‐1 and 1
  – Easier to model inputs that have strongly negative, neutral, and strongly positive values
  – Still suffers from the vanishing gradient problem
• ReLU (Rectified Linear Unit) (see the sketch below)
  – Most widely used activation function
  – Computationally efficient; allows the network to converge very quickly
  – Non‐linear: although it looks like a linear function, ReLU has a derivative and allows backpropagation
  – Dying ReLU problem: when inputs are negative (or zero), the gradient of the function is zero, so the affected neurons stop learning
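A sketch of ReLU and its (sub)gradient (the sample values are illustrative), showing why negative net inputs pass back no learning signal (the dying ReLU problem):

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Subgradient: 1 for x > 0, 0 otherwise
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(x))       # [0.  0.  0.  0.5 3. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]  -> zero gradient for x <= 0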
Types of Activation Functions
• Leaky ReLU (see the sketch below)
  – An improved version of the ReLU function
  – Uses a small linear component of x for x < 0
  – Removes the zero gradient and avoids dead neurons for x < 0
  – Dead neuron = a unit that always produces the same output, plays no role in discriminating the input, and is essentially useless
• Softplus
  – A smooth approximation to ReLU
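A sketch of Leaky ReLU and Softplus (the leak coefficient 0.01 is a common but illustrative choice):

import numpy as np

def leaky_relu(x, alpha=0.01):
    # Small linear component for x < 0 keeps a non-zero gradient (no dead neurons)
    return np.where(x > 0, x, alpha * x)

def softplus(x):
    # Smooth approximation to ReLU: log(1 + e^x)
    return np.log1p(np.exp(x))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(leaky_relu(x))  # [-0.03 -0.01  0.    1.    3.  ]
print(softplus(x))    # smooth and strictly positive; approaches ReLU for large x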
Choosing the Right Activation Function
• Choose an activation function that approximates the target function quickly, leading to a faster training process
• For binary classification, use the sigmoid function in the output layer
• Linear activation functions are used in the output layer for regression problems
• Sigmoid and tanh functions are sometimes avoided due to the vanishing gradient problem
• ReLU can suffer from the dead neuron problem
• If dead neurons occur in the network, the leaky ReLU function is the best choice
• If you really don’t know which function to use, simply use ReLU
• The ReLU function should only be used in the hidden layers (a small helper summarizing these rules of thumb is sketched below)
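A tiny illustrative helper (the function name and mapping are assumptions, simply restating the rules of thumb above):

def suggest_activations(task: str) -> dict:
    """Rule-of-thumb activation choices; illustrative only."""
    output = {"binary_classification": "sigmoid", "regression": "linear"}.get(task, "sigmoid")
    # ReLU is the default choice for hidden layers
    return {"hidden_layers": "relu", "output_layer": output}

print(suggest_activations("binary_classification"))  # {'hidden_layers': 'relu', 'output_layer': 'sigmoid'}
print(suggest_activations("regression"))             # {'hidden_layers': 'relu', 'output_layer': 'linear'}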
Training Multilayer Perceptrons
• Backpropagation Algorithm (a high‑level training loop is sketched below)
  – Learns using a gradient descent method
  – Minimizes the mean squared difference between the network’s prediction and the known target value
  – The error is propagated backwards to update the weights
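A high-level sketch of the gradient-descent training loop described above; the toy data, network size, and learning rate are illustrative, and the per-unit error terms follow the update rules given later in this lecture:

import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 training tuples, 3 inputs, a binary target (illustrative)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)

# One hidden layer with 2 units, one output unit
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(2)
W2, b2 = rng.normal(scale=0.5, size=(2, 1)), np.zeros(1)
lr = 0.5

for epoch in range(2000):
    # Propagate the inputs forward
    H = sigmoid(X @ W1 + b1)                  # hidden-layer outputs
    O = sigmoid(H @ W2 + b2)                  # network predictions
    # Backpropagate the error: Err = O(1 - O)(T - O) for the output layer
    err_out = O * (1 - O) * (T - O)           # output-layer error
    err_hid = H * (1 - H) * (err_out @ W2.T)  # hidden-layer error
    # Update the weights and biases
    W2 += lr * H.T @ err_out; b2 += lr * err_out.sum(axis=0)
    W1 += lr * X.T @ err_hid; b1 += lr * err_hid.sum(axis=0)

print(np.round(O.ravel(), 2), T.ravel())  # predictions should approach the targets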
[Figure: flow of the signal forward through the network and backpropagation of the error]
Backpropagation Algorithm
• Neural network learning for classification or numeric prediction, using the backpropagation algorithm
• Input
  – D, a data set consisting of the training tuples and their associated target values
  – l, the learning rate
  – network, a multilayer feed‐forward network
• Output: a trained neural network
Backpropagation Algorithm
• Initialize the weights
  – weights and biases (thresholds) are initialized to small random numbers (e.g., in [‐1.0, 1.0] or [‐0.5, 0.5])
• Propagate the inputs forward
  – the training tuple is fed to the network’s input layer
  – inputs pass through the input units unchanged
  – the net input to a unit j in the hidden or output layers is computed as a linear combination of its inputs: Ij = Σi wij Oi + θj, where Oi is the output of unit i in the previous layer, wij is the connecting weight, and θj is the bias of unit j
Backpropagation Algorithm
• Each unit in the hidden and output layers takes its net input and then applies an activation function to it
• The logistic / sigmoid function is used: Oj = 1 / (1 + e^(-Ij))
Backpropagation Algorithm
• Backpropagate the error
  – The error is propagated backward by updating the weights and biases to reflect the error of the network’s prediction
  – The error Errj of a unit j in the output layer is computed as Errj = Oj (1 - Oj)(Tj - Oj)
  – Oj is the actual output of unit j, and Tj is the known target value of the given training tuple
  – The error of a hidden layer unit j is Errj = Oj (1 - Oj) Σk Errk wjk, where the sum runs over the units k in the next layer connected to j
  – Weights and biases are updated as (l being the learning rate): Δwij = (l) Errj Oi, wij = wij + Δwij, Δθj = (l) Errj, θj = θj + Δθj


Backpropagation Algorithm
• Terminating condition
  – Training stops when
    • all weight changes Δwij in the previous epoch (iteration) were below some specified threshold, or
    • the percentage of tuples misclassified in the previous epoch is below some specified threshold, or
    • a prespecified number of epochs has expired
Backpropagation Algorithm
• Some comments
  – The learning rate helps avoid getting stuck at a local minimum
  – If the learning rate is too small, learning occurs at a very slow pace
  – If the learning rate is too large, oscillation between inadequate solutions may occur
  – Time complexity of backpropagation is O(n · m · h^k · o · i)
    • n training samples
    • m features
    • h neurons per hidden layer
    • k hidden layers
    • o output neurons
    • i is the number of iterations (epochs)
Backpropagation Algorithm

• Traditional default learning rate values are 0.1, 0.01, and 0.001
Backpropagation Algorithm
• Example. A multilayer feed‐forward neural network and initial
weight and bias values are given
– Training tuple X =(1, 0, 1), class label 1
– Learning rate = 0.9
– Sigmoid activation function
Backpropagation Algorithm

The error Errj of a unit j in the output layer is computed as Errj = Oj (1 - Oj)(Tj - Oj)

The error of a hidden layer unit j is Errj = Oj (1 - Oj) Σk Errk wjk

A single forward and backward pass for an example of this form is sketched below.
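A minimal sketch of one forward and backward pass for an example like this one: the training tuple X = (1, 0, 1), class label 1, learning rate 0.9, and sigmoid activation come from the slide, while the network shape (3 inputs, 2 hidden units, 1 output unit) and the initial weights and biases are placeholder values chosen for illustration, since the slide’s figure is not reproduced in this text.

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.0, 1.0])   # training tuple X = (1, 0, 1)
target, lr = 1.0, 0.9           # class label 1, learning rate 0.9

# Placeholder initial weights and biases (illustrative only)
W_ih = np.array([[ 0.2, -0.3],
                 [ 0.4,  0.1],
                 [-0.5,  0.2]])      # input -> hidden weights
b_h = np.array([-0.4, 0.2])          # hidden biases
W_ho = np.array([-0.3, -0.2])        # hidden -> output weights
b_o = 0.1                            # output bias

# Propagate the inputs forward
O_h = sigmoid(x @ W_ih + b_h)        # hidden-unit outputs
O_o = sigmoid(O_h @ W_ho + b_o)      # network prediction

# Backpropagate the error
err_o = O_o * (1 - O_o) * (target - O_o)   # output-unit error
err_h = O_h * (1 - O_h) * (err_o * W_ho)   # hidden-unit errors

# Update the weights and biases
W_ho += lr * err_o * O_h; b_o += lr * err_o
W_ih += lr * np.outer(x, err_h); b_h += lr * err_h

print(round(float(O_o), 3), np.round(err_o, 4))  # prediction and output-layer error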


Backpropagation Algorithm
Representational Power
• A neural network with at least one hidden layer is a universal approximator (it can approximate any continuous function to arbitrary accuracy, given enough hidden units)
• Capacity of the network increases with more hidden units and
more hidden layers
