0% found this document useful (0 votes)

18 views6 pages

UNIT-III Activation-Function

An activation function is a mathematical function that introduces non-linearity into neural networks, enabling them to learn complex patterns. Non-linear activation functions, such as Sigmoid, Tanh, and ReLU, are essential for effective learning and decision-making in deep learning models. The choice of activation function significantly impacts model performance, including convergence speed and gradient flow.

Uploaded by

sahilsharma747392

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

18 views6 pages

UNIT-III Activation-Function

Uploaded by

sahilsharma747392

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

What is an Activation Function?

An activation function is a mathematical function applied to the output of a neuron. It

introduces non-linearity into the model, allowing the network to learn and represent complex
patterns in the data. Without this non-linearity feature, a neural network would behave like a
linear regression model, no matter how many layers it has.

The activation function decides whether a neuron should be activated by calculating the
weighted sum of inputs and adding a bias term. This helps the model make complex decisions
and predictions by introducing non-linearities to the output of each neuron.

Why is Non-Linearity Important in Neural Networks?

Neural networks consist of neurons that operate using weights, biases, and activation functions.

In the learning process, these weights and biases are updated based on the error produced at the
output—a process known as backpropagation. Activation functions enable backpropagation by
providing gradients that are essential for updating the weights and biases.

Without non-linearity, even deep networks would be limited to solving only simple, linearly
separable problems. Activation functions empower neural networks to model highly complex
data distributions and solve advanced deep learning tasks. Adding non-linear activation functions
introduce flexibility and enable the network to learn more complex and abstract patterns from
data.

Mathematical Proof of Need of Non-Linearity in Neural Networks

To illustrate the need for non-linearity in neural networks with a specific example, let’s consider
a network with two input nodes (i1and i2)(i1and i2), a single hidden layer containing one
neuron (h1)(h1), and an output neuron (out). We will use w1,w2w1,w2 as weights connecting the
inputs to the hidden neuron, and w5w5 as the weight connecting the hidden neuron to the output.
We’ll also include biases (b1b1 for the hidden neuron and b2b2 for the output neuron) to
complete the model.

Types of Activation Functions in Deep Learning

1. Linear Activation Function

Linear Activation Function resembles straight line define by y=x. No matter how many layers
the neural network contains, if they all use linear activation functions, the output is a linear
combination of the input.

• The range of the output spans from (−∞ to +∞) (−∞ to +∞).
• Linear activation function is used at just one place i.e. output layer.

• Using linear activation across all layers makes the network’s ability to learn complex
patterns limited.

Linear activation functions are useful for specific tasks but must be combined with non-linear
functions to enhance the neural network’s learning and predictive capabilities.

Non-Linear Activation Functions

1. Sigmoid Function

Sigmoid Activation Function is characterized by ‘S’ shape. It is mathematically defined

asA=11+e−xA=1+e−x1. This formula ensures a smooth and continuous output that is essential
for gradient-based optimization methods.

• It allows neural networks to handle and model complex patterns that linear equations
cannot.

• The output ranges between 0 and 1, hence useful for binary classification.

• The function exhibits a steep gradient when x values are between -2 and 2. This
sensitivity means that small changes in input x can cause significant changes in output y,
which is critical during the training process.
Sigmoid or Logistic Activation Function Graph

2. Tanh Activation Function

Tanh function or hyperbolic tangent function, is a shifted version of the sigmoid, allowing it
to stretch across the y-axis. It is defined as:

f(x)=tanh⁡(x)=2/1+e−2x–1.

f(x)=tanh(x)=1+e−2x2–1.

Alternatively, it can be expressed using the sigmoid function:

tanh⁡(x)=2×sigmoid(2x)–1tanh(x)=2×sigmoid(2x)–1

• Value Range: Outputs values from -1 to +1.

• Non-linear: Enables modeling of complex data patterns.

• Use in Hidden Layers: Commonly used in hidden layers due to its zero-centered output,
facilitating easier learning for subsequent layers.

Tanh Activation Function

3. ReLU (Rectified Linear Unit) Function

ReLU activation is defined by A(x)=max⁡(0,x)A(x)=max(0,x), this means that if the input x is

positive, ReLU returns x, if the input is negative, it returns 0.

• Value Range: [0,∞)[0,∞), meaning the function only outputs non-negative values.

• Nature: It is a non-linear activation function, allowing neural networks to learn complex

patterns and making backpropagation more efficient.

• Advantage over other Activation: ReLU is less computationally expensive than tanh
and sigmoid because it involves simpler mathematical operations. At a time only a few
neurons are activated making the network sparse making it efficient and easy for
computation.

ReLU Activation Function

3. Exponential Linear Units

1. Softmax Function

Softmax function is designed to handle multi-class classification problems. It transforms raw

output scores from a neural network into probabilities. It works by squashing the output values of
each class into the range of 0 to 1, while ensuring that the sum of all probabilities equals 1.

• Softmax is a non-linear activation function.

• The Softmax function ensures that each class is assigned a probability, helping to identify
which class the input belongs to.
Softmax Activation Function

2. SoftPlus Function

Softplus function is defined mathematically as: A(x)=log⁡(1+ex)A(x)=log(1+ex). This

equation ensures that the output is always positive and differentiable at all points, which is an
advantage over the traditional ReLU function.

• Nature: The Softplus function is non-linear.

• Range: The function outputs values in the range (0,∞)(0,∞), similar to ReLU, but without
the hard zero threshold that ReLU has.

• Smoothness: Softplus is a smooth, continuous function, meaning it avoids the sharp

discontinuities of ReLU, which can sometimes lead to problems during optimization.

Softplus Activation Function

Impact of Activation Functions on Model Performance

The choice of activation function has a direct impact on the performance of a neural network in
several ways:

1. Convergence Speed: Functions like ReLU allow faster training by avoiding the
vanishing gradient problem, while Sigmoid and Tanh can slow down convergence in
deep networks.

2. Gradient Flow: Activation functions like ReLU ensure better gradient flow, helping
deeper layers learn effectively. In contrast, Sigmoid can lead to small gradients,
hindering learning in deep layers.

3. Model Complexity: Activation functions like Softmax allow the model to handle
complex multi-class problems, whereas simpler functions like ReLU or Leaky ReLU are
used for basic layers.

Conclusion

Activation functions are the backbone of neural networks, enabling them to capture non-linear
relationships in data. From classic functions like Sigmoid and Tanh to modern variants like
ReLU and Swish, each has its place in different types of neural networks. The key is to
understand their behavior and choose the right one based on your model’s needs.

• ReLU outputs the input directly if it’s positive, or zero otherwise, and is used in hidden
layers to speed up training.

• Softmax is used in the output layer for multi-class classification, converting raw outputs
into probabilities for each class.

What is the ReLU activation function?

ReLU is activation function that helps avoid vanishing gradients and computationally efficient in
deep learning.

What is the difference between ReLU and TANH?

ReLU outputs positive values directly and zero for negatives, while Tanh maps inputs between -
1 and 1. Tanh is zero-centered but suffers from vanishing gradients, unlike ReLU which does not
for positive values.

Department of Electronics & Electrical Engineering: Ec5245: Artificial Neural Network & Fuzzy Logic
No ratings yet
Department of Electronics & Electrical Engineering: Ec5245: Artificial Neural Network & Fuzzy Logic
51 pages
Activation Functions in Neural Networks
No ratings yet
Activation Functions in Neural Networks
7 pages
Act Fun
No ratings yet
Act Fun
7 pages
Activation
No ratings yet
Activation
7 pages
Activation Function
No ratings yet
Activation Function
31 pages
Activation Functions
No ratings yet
Activation Functions
9 pages
Activation Function
No ratings yet
Activation Function
9 pages
Need and Use of Activation Functions in Anndeep Learning
No ratings yet
Need and Use of Activation Functions in Anndeep Learning
7 pages
4 4 Choosing The Right Activation Function For Neural Networks
No ratings yet
4 4 Choosing The Right Activation Function For Neural Networks
25 pages
Deep Learning
No ratings yet
Deep Learning
10 pages
Unit 2
No ratings yet
Unit 2
35 pages
Activatn FN 2
No ratings yet
Activatn FN 2
10 pages
Fundamentals of Neural Network
No ratings yet
Fundamentals of Neural Network
84 pages
Activation Function
No ratings yet
Activation Function
43 pages
Deep Learning Tutorial 3
No ratings yet
Deep Learning Tutorial 3
12 pages
Module-4 Neural Network
No ratings yet
Module-4 Neural Network
61 pages
Unit 3 Deep Learning
No ratings yet
Unit 3 Deep Learning
11 pages
Functii de Activare1
No ratings yet
Functii de Activare1
89 pages
Activation Functions
No ratings yet
Activation Functions
6 pages
7 Types of Neural Network Activation Functions
No ratings yet
7 Types of Neural Network Activation Functions
16 pages
Activation Functions in Neural Networks
No ratings yet
Activation Functions in Neural Networks
10 pages
Experiment No. 1 SL-II (ANN)
No ratings yet
Experiment No. 1 SL-II (ANN)
3 pages
Lecture 9-NN - Modified
No ratings yet
Lecture 9-NN - Modified
94 pages
Module1 - Upto Loss Function
No ratings yet
Module1 - Upto Loss Function
137 pages
Module1
No ratings yet
Module1
124 pages
3-Activation Function, Loss Function-24-07-2024
No ratings yet
3-Activation Function, Loss Function-24-07-2024
19 pages
Activation Functions in Neural Networks: What Is Activation Function?
No ratings yet
Activation Functions in Neural Networks: What Is Activation Function?
11 pages
DL Answers
No ratings yet
DL Answers
24 pages
Activation Function in NN
No ratings yet
Activation Function in NN
29 pages
Neural Network Example and Activation Functions Summary
No ratings yet
Neural Network Example and Activation Functions Summary
2 pages
SoftComp 02
No ratings yet
SoftComp 02
33 pages
Activation Function 1706811454
No ratings yet
Activation Function 1706811454
11 pages
Activation Funtions
No ratings yet
Activation Funtions
26 pages
Perceptron in Machine Learning
No ratings yet
Perceptron in Machine Learning
11 pages
Activation Functions
No ratings yet
Activation Functions
3 pages
Activation Function
No ratings yet
Activation Function
36 pages
Study of Ensemble of Activation Functions in Deep Learning
No ratings yet
Study of Ensemble of Activation Functions in Deep Learning
10 pages
Lec 22 Activations Functions Complete
No ratings yet
Lec 22 Activations Functions Complete
33 pages
Performance Analysis of Various Activation Functio
No ratings yet
Performance Analysis of Various Activation Functio
7 pages
Unit Iv
No ratings yet
Unit Iv
34 pages
Lecture 2.1.2activation Function
No ratings yet
Lecture 2.1.2activation Function
15 pages
Deep Learning: International Islamic University of Chittagong
No ratings yet
Deep Learning: International Islamic University of Chittagong
31 pages
Fundamentals Deep Learning Activation Functions When To Use Them
No ratings yet
Fundamentals Deep Learning Activation Functions When To Use Them
15 pages
4 - Activation Functions in Neural Networks
No ratings yet
4 - Activation Functions in Neural Networks
12 pages
Mod 2.3 - Activation Function
No ratings yet
Mod 2.3 - Activation Function
9 pages
Activation Function
No ratings yet
Activation Function
34 pages
Activation Functions
No ratings yet
Activation Functions
8 pages
003 Activation Functions in Machine Learning
No ratings yet
003 Activation Functions in Machine Learning
19 pages
Types of Neural Network Activation Functions - How To Choose
No ratings yet
Types of Neural Network Activation Functions - How To Choose
36 pages
Mod 2.3 - Activation Function, Loss Functions
No ratings yet
Mod 2.3 - Activation Function, Loss Functions
12 pages
Activation Functions in Neural Networks - 241102 - 224129
No ratings yet
Activation Functions in Neural Networks - 241102 - 224129
7 pages
Deep Learning
No ratings yet
Deep Learning
5 pages
Unit 5 Activation Function
No ratings yet
Unit 5 Activation Function
15 pages
Lect 5 - Non Linear Activation Functions
No ratings yet
Lect 5 - Non Linear Activation Functions
41 pages
Activation Function
No ratings yet
Activation Function
4 pages
Ad3451 ML Unit 4 Notes
No ratings yet
Ad3451 ML Unit 4 Notes
34 pages
Activation Function: Deep Neural Networks
No ratings yet
Activation Function: Deep Neural Networks
47 pages
26 - Netinput Activation Function Forward and Back Propogation
No ratings yet
26 - Netinput Activation Function Forward and Back Propogation
41 pages
Pr1 ANN Writeup
No ratings yet
Pr1 ANN Writeup
7 pages
How To Choose An Activation Function For Deep Learning
No ratings yet
How To Choose An Activation Function For Deep Learning
15 pages
Backpropagation: Fundamentals and Applications for Preparing Data for Training in Deep Learning
From Everand
Backpropagation: Fundamentals and Applications for Preparing Data for Training in Deep Learning
Fouad Sabry
No ratings yet
Deep Learnig
No ratings yet
Deep Learnig
16 pages
Aneja Convolutional Image Captioning CVPR 2018 Paper
No ratings yet
Aneja Convolutional Image Captioning CVPR 2018 Paper
10 pages
2021 Exam2 Solution
No ratings yet
2021 Exam2 Solution
11 pages
4 Implementing A GPT Model From Scratch To Generate Text - Build A Large Language Model (From Scratch)
No ratings yet
4 Implementing A GPT Model From Scratch To Generate Text - Build A Large Language Model (From Scratch)
52 pages
Unit 5
No ratings yet
Unit 5
39 pages
DL 4
No ratings yet
DL 4
11 pages
Introduction To Deep Learning
No ratings yet
Introduction To Deep Learning
2 pages
Activation Functions
No ratings yet
Activation Functions
34 pages
Components-Algorithms/: The Basic Architecture of Neural Networks: Single Computational Layer
No ratings yet
Components-Algorithms/: The Basic Architecture of Neural Networks: Single Computational Layer
65 pages
13 Pretraining
No ratings yet
13 Pretraining
47 pages
Pre Print
No ratings yet
Pre Print
41 pages
2003.00130 - James Wallbridge - Transformers For Limit Order Books
No ratings yet
2003.00130 - James Wallbridge - Transformers For Limit Order Books
16 pages
Learning From Data: 10: Neural Networks - I
No ratings yet
Learning From Data: 10: Neural Networks - I
38 pages
CSA501 - QB Neural Network Deep Learning - Updated2024
No ratings yet
CSA501 - QB Neural Network Deep Learning - Updated2024
11 pages
Multi Layer Perceptron - Notes
No ratings yet
Multi Layer Perceptron - Notes
4 pages
What Is Backpropagation
No ratings yet
What Is Backpropagation
8 pages
19eid331 - Artificial Neural Networks
No ratings yet
19eid331 - Artificial Neural Networks
3 pages
Efficient nonlinear function approximation in analog resistive crossbars for recurrent neural networks的全文翻译
No ratings yet
Efficient nonlinear function approximation in analog resistive crossbars for recurrent neural networks的全文翻译
15 pages
Diabetes Detection Using Deep Learning Algorithms: ICT Express November 2018
No ratings yet
Diabetes Detection Using Deep Learning Algorithms: ICT Express November 2018
5 pages
Adaptive Linear Neuron
No ratings yet
Adaptive Linear Neuron
4 pages
DL Unit 3-5
No ratings yet
DL Unit 3-5
44 pages
Bab 7
No ratings yet
Bab 7
3 pages
CSM 422
No ratings yet
CSM 422
2 pages
Mini Project 1
No ratings yet
Mini Project 1
16 pages
ANN Self Slides
No ratings yet
ANN Self Slides
59 pages
Answer All Questions PART A - (5 2 10)
No ratings yet
Answer All Questions PART A - (5 2 10)
2 pages
Lecture 2
No ratings yet
Lecture 2
12 pages
Deeplearning - Ai Deeplearning - Ai
No ratings yet
Deeplearning - Ai Deeplearning - Ai
43 pages
d2l en PDF
100% (1)
d2l en PDF
1,024 pages

UNIT-III Activation-Function

Uploaded by

UNIT-III Activation-Function

Uploaded by

What is an Activation Function?

An activation function is a mathematical function applied to the output of a neuron. It

Why is Non-Linearity Important in Neural Networks?

Mathematical Proof of Need of Non-Linearity in Neural Networks

Types of Activation Functions in Deep Learning

1. Linear Activation Function

Non-Linear Activation Functions

Sigmoid Activation Function is characterized by ‘S’ shape. It is mathematically defined

2. Tanh Activation Function

Alternatively, it can be expressed using the sigmoid function:

• Value Range: Outputs values from -1 to +1.

• Non-linear: Enables modeling of complex data patterns.

Tanh Activation Function

ReLU activation is defined by A(x)=max⁡(0,x)A(x)=max(0,x), this means that if the input x is

• Nature: It is a non-linear activation function, allowing neural networks to learn complex

ReLU Activation Function

3. Exponential Linear Units

Softmax function is designed to handle multi-class classification problems. It transforms raw

• Softmax is a non-linear activation function.

Softplus function is defined mathematically as: A(x)=log⁡(1+ex)A(x)=log(1+ex). This

• Nature: The Softplus function is non-linear.

• Smoothness: Softplus is a smooth, continuous function, meaning it avoids the sharp

Softplus Activation Function

Impact of Activation Functions on Model Performance

What is the ReLU activation function?

What is the difference between ReLU and TANH?

You might also like