
Activation Functions in Neural Networks

What is Activation Function?

It is simply a function that you use to get the output of a node. It is
also known as a Transfer Function.

Why do we use Activation Functions with Neural Networks?

An activation function is used to determine the output of a neural
network, like yes or no. It maps the resulting values into a range such
as 0 to 1 or -1 to 1, etc. (depending upon the function).

The Activation Functions can basically be divided into two types:

1. Linear Activation Function

2. Non-linear Activation Functions

FYI: The cheat sheet is given below.

Linear or Identity Activation Function

As you can see, the function is a line, i.e. linear. Therefore, the output
of the function will not be confined to any range.
Fig: Linear Activation Function

Equation: f(x) = x

Range: (-infinity, infinity)

It doesn’t help with the complexity or various parameters of the usual
data that is fed to neural networks.
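
As a minimal sketch of this limitation (written with NumPy, which the
article itself does not require), the identity activation has a constant
slope of 1 everywhere, so its derivative carries no information about
the input:

```python
import numpy as np

def linear(x):
    # Identity / linear activation: f(x) = x
    return x

def linear_derivative(x):
    # The slope is 1 everywhere, regardless of the input
    return np.ones_like(x)

x = np.array([-2.0, 0.0, 3.5])
print(linear(x))             # [-2.   0.   3.5]
print(linear_derivative(x))  # [1. 1. 1.]
```
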
Non-linear Activation Function

The nonlinear activation functions are the most used activation
functions. Nonlinearity helps to make the graph look something like
this:

Fig: Non-linear Activation Function

It makes it easy for the model to generalize or adapt to a variety of
data and to differentiate between the outputs.

The main terms you need to understand for nonlinear functions are:

Derivative or Differential: the change in the y-axis w.r.t. the change in
the x-axis. It is also known as the slope.

Monotonic function: a function which is either entirely non-increasing
or entirely non-decreasing.

The nonlinear activation functions are mainly divided on the basis of
their range or curves:

1. Sigmoid or Logistic Activation Function

The sigmoid function curve looks like an S-shape.

Fig: Sigmoid Function

The main reason why we use the sigmoid function is that its output
exists between (0, 1). Therefore, it is especially used for models where
we have to predict a probability as the output. Since the probability of
anything exists only between the range of 0 and 1, sigmoid is the right
choice.

The function is differentiable. That means we can find the slope of the
sigmoid curve at any point.

The function is monotonic but the function’s derivative is not.

Because its gradient becomes very small for large positive or negative
inputs, the logistic sigmoid function can cause a neural network to get
stuck during training.

The softmax function is a more generalized logistic activation function
which is used for multiclass classification.
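
A minimal NumPy sketch of the sigmoid, its derivative, and the softmax
generalization (the function names here are illustrative, not from any
particular library):

```python
import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)); output lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x)); largest at x = 0, so not monotonic
    s = sigmoid(x)
    return s * (1.0 - s)

def softmax(x):
    # Generalizes the logistic function to multiple classes;
    # subtracting the max keeps the exponentials numerically stable
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-4.0, 0.0, 4.0])
print(sigmoid(x))             # values squashed into (0, 1)
print(sigmoid_derivative(x))  # tiny at the tails -> training can stall
print(softmax(np.array([1.0, 2.0, 3.0])))  # probabilities summing to 1
```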

2. Tanh or hyperbolic tangent Activation Function

tanh is also like the logistic sigmoid, but better. The range of the tanh
function is (-1, 1). tanh is also sigmoidal (S-shaped).
Fig: tanh v/s Logistic Sigmoid

The advantage is that the negative inputs will be mapped strongly
negative and the zero inputs will be mapped near zero in the tanh
graph.

The function is differentiable.

The function is monotonic while its derivative is not monotonic.

The tanh function is mainly used for classification between two classes.

Both tanh and logistic sigmoid activation functions are used in
feed-forward nets.
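
A minimal sketch in the same NumPy style, showing that tanh maps
negative inputs strongly negative and zero inputs near zero:

```python
import numpy as np

def tanh(x):
    # Range (-1, 1); S-shaped like the sigmoid but zero-centered
    return np.tanh(x)

def tanh_derivative(x):
    # f'(x) = 1 - tanh(x)^2; maximal at 0, so not monotonic
    return 1.0 - np.tanh(x) ** 2

x = np.array([-3.0, 0.0, 3.0])
print(tanh(x))             # roughly [-0.995  0.     0.995]
print(tanh_derivative(x))  # small at the tails, 1.0 at zero
```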

3. ReLU (Rectified Linear Unit) Activation Function

The ReLU is the most used activation function in the world right now,
since it is used in almost all convolutional neural networks and deep
learning models.

Fig: ReLU v/s Logistic Sigmoid

As you can see, the ReLU is half rectified (from the bottom). f(z) is zero
when z is less than zero, and f(z) is equal to z when z is greater than
or equal to zero.

Range: [0, infinity)

The function and its derivative both are monotonic.


But the issue is that all the negative values become zero immediately,
which decreases the ability of the model to fit or train on the data
properly. Any negative input given to the ReLU activation function
turns into zero immediately, so negative values are not mapped
appropriately.
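
Continuing the same NumPy sketch, ReLU and its derivative are both
monotonic; note that using 0 as the derivative at z = 0 is a convention
assumed here, since the true derivative is undefined at that point:

```python
import numpy as np

def relu(z):
    # f(z) = 0 for z < 0, f(z) = z for z >= 0
    return np.maximum(0.0, z)

def relu_derivative(z):
    # 0 for z <= 0 and 1 for z > 0 (taking 0 at z = 0 by convention).
    # Negative inputs receive zero gradient, which is why units can
    # stop learning -- the "dying ReLU" problem.
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))             # [0.  0.  0.  1.5]
print(relu_derivative(z))  # [0. 0. 0. 1.]
```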

4. Leaky ReLU

It is an attempt to solve the dying ReLU problem.

Fig: ReLU v/s Leaky ReLU

Can you see the Leak? 😆

The leak helps to increase the range of the ReLU function. Usually, the
value of a is 0.01 or so.

When a is not a fixed value like 0.01 but is chosen at random, it is
called Randomized ReLU.


Therefore, the range of the Leaky ReLU is (-infinity, infinity).

Both the Leaky and Randomized ReLU functions are monotonic in
nature. Their derivatives are also monotonic in nature.
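
A minimal NumPy sketch of Leaky ReLU; the parameter name a mirrors
the text, and the default of 0.01 is the usual value mentioned above:

```python
import numpy as np

def leaky_relu(z, a=0.01):
    # f(z) = z for z >= 0 and a * z for z < 0; the small slope a
    # is the "leak" that keeps negative inputs from dying to zero
    return np.where(z >= 0, z, a * z)

def leaky_relu_derivative(z, a=0.01):
    # 1 for z >= 0 and a for z < 0; both the function and its
    # derivative are monotonic
    return np.where(z >= 0, 1.0, a)

z = np.array([-10.0, -1.0, 0.0, 2.0])
print(leaky_relu(z))             # [-0.1  -0.01  0.    2.  ]
print(leaky_relu_derivative(z))  # [0.01 0.01 1.   1.  ]
```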

Why is the derivative/differentiation used?

When updating the curve, the derivative tells us in which direction and
by how much to change or update the curve, depending upon the
slope. That is why we use differentiation in almost every part of
Machine Learning and Deep Learning.
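
As a toy illustration of how the slope drives the update (the loss
L(w) = (w - 3)^2, the starting weight, and the learning rate below are
hypothetical choices, not from the article), one step of gradient
descent moves the weight against the derivative:

```python
# Toy gradient descent on L(w) = (w - 3)^2, whose minimum is at w = 3
def loss_derivative(w):
    # dL/dw = 2 * (w - 3); its sign gives the direction of the update,
    # its magnitude (the slope) gives how much to update
    return 2.0 * (w - 3.0)

w = 0.0    # hypothetical starting weight
lr = 0.1   # hypothetical learning rate
for _ in range(25):
    w -= lr * loss_derivative(w)  # step against the slope
print(round(w, 3))  # approaches 3.0 as the slope shrinks near the minimum
```
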
Fig: Activation Function Cheatsheet
Fig: Derivative of Activation Functions
