Activation Functions


Linear & Non-Linear Units

Linearity refers to the property of a system or model where the output is directly proportional to the input, while non-linearity implies that the relationship between input and output is more complex and cannot be expressed as a simple linear function.

A Rectified Linear Unit (ReLU) is a form of activation function commonly used in deep learning models. The function returns 0 if it receives a negative input; if it receives a positive value, it returns that same value.

Linear Classification refers to categorizing a set of data points into discrete classes based on a linear combination of their explanatory variables.

Non-Linear Classification refers to categorizing instances that are not linearly separable: it is not possible to separate such data with a straight line. The XOR problem is the classic example of data that no single straight line can separate.

The linear transfer function calculates the neuron's output by simply returning the value
passed to it.

This neuron can be trained to learn an affine function of its inputs, or to find a linear
approximation to a nonlinear function.

A linear network cannot be made to perform a nonlinear computation.


Elements of a Neural Network

Input Layer: This layer accepts the input features. It provides information from the outside world to the network; no computation is performed at this layer, and the nodes simply pass the information (features) on to the hidden layer.

Hidden Layer: Nodes of this layer are not exposed to the outer world; they are part of the abstraction provided by any neural network. The hidden layer performs all sorts of computation on the features entered through the input layer and transfers the result to the output layer.

Output Layer: This layer brings the information learned by the network up to the outer world.
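To make these layer roles concrete, here is a minimal sketch of a single forward pass in NumPy (the layer sizes, the random weights, and the use of ReLU in the hidden layer are illustrative choices, not something fixed by the text above):

import numpy as np

# Hypothetical example: 2 input features, 3 hidden neurons, 1 output neuron
x = np.array([0.5, -1.2])                        # input layer just passes the features on

W_hidden = np.random.randn(3, 2)                 # hidden-layer weights
b_hidden = np.zeros(3)                           # hidden-layer biases
hidden = np.maximum(0, W_hidden @ x + b_hidden)  # hidden layer computes on the features (ReLU)

W_out = np.random.randn(1, 3)                    # output-layer weights
b_out = np.zeros(1)                              # output-layer bias
output = W_out @ hidden + b_out                  # output layer exposes the learned result
print(output)
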
What is an activation function and why use one?

The activation function decides whether a neuron should be activated or not by calculating the weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.
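For a single neuron this looks roughly as follows (a sketch; the weight, bias and input values are arbitrary, and the sigmoid is used only as an example activation):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.4, -0.7, 0.2])   # weights
x = np.array([1.0, 0.5, -1.5])   # inputs
b = 0.1                          # bias

z = np.dot(w, x) + b             # weighted sum plus bias
a = sigmoid(z)                   # activation applied to the result
print(z, a)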

We know that a neural network has neurons that work in correspondence with weights, biases, and their respective activation functions.

In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as backpropagation.

Activation functions make backpropagation possible, since the gradients are supplied along with the error to update the weights and biases.
Why do we need a non-linear activation function?
A neural network without an activation function is essentially just a linear regression
model.

The activation function performs a non-linear transformation of the input, making the network capable of learning and performing more complex tasks.
Mathematical proof

Elements of the diagram are as follows:


Hidden layer i.e. layer 1:
z(1) = W(1)X + b(1)
a(1) = z(1)
Here,
z(1) is the vectorized output of layer 1,
W(1) is the vectorized weights assigned to the neurons of the hidden layer, i.e. w1, w2, w3 and w4,
X is the vectorized input features, i.e. i1 and i2,
b(1) is the vectorized bias assigned to the neurons of the hidden layer, i.e. b1 and b2,
a(1) is the vectorized form of the linear output.
(Note: we are not considering an activation function here.)

Layer 2 i.e. output layer:
(Note: the input for layer 2 is the output from layer 1.)
z(2) = W(2)a(1) + b(2)
a(2) = z(2)
Calculation at Output layer
z(2) = (W(2) * [W(1)X + b(1)]) + b(2)
z(2) = [W(2) * W(1)] * X + [W(2)*b(1) + b(2)]
Let,
[W(2) * W(1)] = W
[W(2)*b(1) + b(2)] = b
Final output : z(2) = W*X + b
which is again a linear function
The output is thus again a linear function even after applying a hidden layer. Hence we can conclude that no matter how many hidden layers we attach to the neural net, all layers will behave the same way, because the composition of two linear functions is itself a linear function.
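A quick numerical check of this collapse (a sketch; the weight and bias values are arbitrary):

import numpy as np

W1 = np.array([[0.2, -0.5], [0.7, 0.3]])   # hidden-layer weights
b1 = np.array([0.1, -0.2])                 # hidden-layer biases
W2 = np.array([[1.5, -0.4]])               # output-layer weights
b2 = np.array([0.05])                      # output-layer bias
x = np.array([2.0, -1.0])

# Two stacked linear layers (no activation function)
z2 = W2 @ (W1 @ x + b1) + b2

# One equivalent single linear layer: W = W2*W1, b = W2*b1 + b2
W = W2 @ W1
b = W2 @ b1 + b2
z_single = W @ x + b

print(np.allclose(z2, z_single))   # True: the two layers collapse into one linear function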

A neuron cannot learn with just a linear function attached to it. A non-linear activation function lets it learn according to the difference with respect to the error. Hence we need a non-linear activation function.
Variants of Activation Function

Linear Function
Equation: a linear function has an equation similar to that of a straight line, i.e. y = x.
No matter how many layers we have, if all of them are linear in nature, the final activation function of the last layer is nothing but a linear function of the input of the first layer.
Range: -inf to +inf

Uses: the linear activation function is used in just one place, the output layer.

For example: calculating the price of a house is a regression problem. A house price may take any large or small value, so we can apply a linear activation at the output layer. Even in this case, the neural net must have a non-linear function in its hidden layers.
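A minimal sketch of the linear (identity) activation and its constant gradient:

def linear(x):
    return x            # identity: output equals input, range (-inf, +inf)

def linear_derivative(x):
    return 1.0          # the gradient is constant, regardless of x
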
Sigmoid Function
It is a function which is plotted as an 'S'-shaped graph.
Equation: A = 1/(1 + e^(-x))
Nature: non-linear. Notice that for x values between -2 and 2, the curve is very steep, which means small changes in x bring about large changes in the value of y.
Value Range: 0 to 1
Uses: usually used in the output layer of a binary classification, where the result is either 0 or 1. Since the value of the sigmoid function lies between 0 and 1 only, the result can easily be predicted to be 1 if the value is greater than 0.5 and 0 otherwise.
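In code, a sketch of the sigmoid and the usual 0.5 decision threshold:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Values are squashed into (0, 1)
print(sigmoid(np.array([-4.0, 0.0, 4.0])))   # approx [0.018, 0.5, 0.982]
print(sigmoid(2.0) > 0.5)                    # True, so this input is classified as 1
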
Tanh Function

The activation that almost always works better than the sigmoid function is the tanh function, also known as the hyperbolic tangent function. It is mathematically a shifted and rescaled version of the sigmoid function; the two are similar and can be derived from each other.

Value Range: -1 to +1
Nature: non-linear
Uses: usually used in the hidden layers of a neural network, as its values lie between -1 and 1, so the mean of the hidden layer's outputs comes out to be 0 or very close to it. This helps centre the data by bringing the mean close to 0, which makes learning for the next layer much easier.
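A sketch verifying that tanh is a rescaled, shifted sigmoid, via the identity tanh(x) = 2*sigmoid(2x) - 1, and that its outputs are roughly zero-centred:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3, 3, 7)
tanh_direct = np.tanh(x)
tanh_from_sigmoid = 2 * sigmoid(2 * x) - 1   # shifted and rescaled sigmoid

print(np.allclose(tanh_direct, tanh_from_sigmoid))   # True
print(tanh_direct.mean())                            # close to 0: outputs are zero-centred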
RELU Function

It stands for Rectified Linear Unit. It is the most widely used activation function, mainly implemented in the hidden layers of a neural network.
Equation: A(x) = max(0, x). It gives an output of x if x is positive and 0 otherwise.
Value Range: [0, inf)

Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function.

Uses: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At any given time only a few neurons are activated, making the network sparse and therefore efficient and easy to compute.
In simple words, ReLU learns much faster than the sigmoid and tanh functions.

A rectified linear unit (ReLU) is an activation function that introduces the property of non-linearity to a deep learning model and helps mitigate the vanishing gradient issue. It returns the positive part of its argument and is one of the most popular activation functions in deep learning.
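In code, a sketch of ReLU and its gradient:

import numpy as np

def relu(x):
    return np.maximum(0, x)       # keeps the positive part, zeroes out the rest

def relu_derivative(x):
    return (x > 0).astype(float)  # gradient is 1 for positive inputs, 0 otherwise

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))              # [0.  0.  0.  0.5 2. ]
print(relu_derivative(x))   # [0. 0. 0. 1. 1.]
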
Stochastic (random) function
A stochastic (random) function X(t) is a many-valued numerical function of an independent argument t whose value, for any fixed value t ∈ T (where T is the domain of the argument), is a random variable, called a cut set.

In stochastic neural networks, instead of assigning deterministic values to each neuron, the algorithm assigns probabilities to each neuron.

A neuron fires only if it passes its threshold value.

Such a network is built by introducing random variation into the network, for example by giving it stochastic weights.
Stochastic modeling forecasts the probability of various outcomes under different
conditions, using random variables.

Stochastic modeling presents data and predicts outcomes that account for certain levels
of unpredictability or randomness.

With a fixed input, the output of a stochastic neural net is likely to be different (stochastic, or random to a certain extent) across multiple evaluations.

This is in contrast to deterministic neural networks, where for a fixed input the output is always the same (deterministic).
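A minimal sketch of such a stochastic neuron (mapping the weighted sum to a firing probability with a sigmoid is an illustrative choice): the same fixed input can produce different outputs across evaluations.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng()

def stochastic_neuron(x, w, b):
    p = sigmoid(np.dot(w, x) + b)   # firing probability instead of a deterministic value
    return int(rng.random() < p)    # fires (1) with probability p, stays silent (0) otherwise

w = np.array([0.8, -0.3])
b = 0.1
x = np.array([1.0, 2.0])

# Same fixed input, yet the outputs can differ from one evaluation to the next
print([stochastic_neuron(x, w, b) for _ in range(10)])
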
What is the capacity of a perceptron?
From an information theory point of view, a single perceptron with K inputs has a
capacity of 2K bits of information.

What is the capacity of a neural network?


Neural networks are defined at various levels of abstraction, modelling different aspects of neural systems. Accordingly, the network capacity refers to the level of abstraction, or the number of fundamental memories, or the number of patterns that can be stored in and recalled from the network.

What is the perceptron convergence procedure?


Perceptron Convergence Theorem: for any finite set of linearly separable labeled examples, the Perceptron Learning Algorithm will halt after a finite number of iterations. In other words, after a finite number of iterations the algorithm yields a weight vector w that classifies all the examples perfectly.
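A sketch of the perceptron learning rule the theorem refers to (the toy data set and the +1/-1 label convention are illustrative choices):

import numpy as np

def perceptron_train(X, y, epochs=100):
    # y holds labels +1 / -1; a bias term is folded in as an extra constant input of 1
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified example
                w += yi * xi              # perceptron update rule
                errors += 1
        if errors == 0:                   # every example classified correctly: halt
            break
    return w

# Toy linearly separable data (the AND function with labels in +1/-1 form)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
print(perceptron_train(X, y))

If the examples were not linearly separable, the inner loop would never reach zero errors and the procedure would simply stop at the epoch limit.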
