Activation Functions 2

The document discusses linear and non-linear models, focusing on their definitions, applications, and the importance of activation functions in neural networks. It explains various activation functions such as ReLU, sigmoid, and softmax, detailing their characteristics and uses in different layers of neural networks. Additionally, it covers stochastic neural networks, their differences from deterministic networks, and the implications of randomness in neural network training and output.


Linear & Non-Linear Models

Linearity refers to the property of a system or model where the output is directly proportional to the input, while nonlinearity implies that the relationship between input and output is more complex and cannot be expressed as a simple linear function.

A Rectified Linear Unit (ReLU) is a form of activation function used commonly in deep learning models. In essence, the function returns 0 if it receives a negative input, and if it receives a positive value, it returns that same value.

Linear classification refers to categorizing a set of data points into discrete classes based on a linear combination of their explanatory variables. Non-linear classification refers to categorizing instances that are not linearly separable; it is not possible to classify such data with a straight line.
The linear transfer function calculates the neuron's output by simply returning the value
passed to it. This neuron can be trained to learn an affine function of its inputs, or to find a
linear approximation to a nonlinear function. A linear network cannot, of course, be made to
perform a nonlinear computation.
A nonlinear neural network is a neural network that uses nonlinear transformations in its layers, such as activation functions, convolution, or pooling. An activation function is a function that adds nonlinearity to the output of a neuron, such as a sigmoid, tanh, or ReLU function.
A nonlinear model describes nonlinear relationships in experimental data. Nonlinear
regression models are generally assumed to be parametric, where the model is described as
a nonlinear equation. Typically machine learning methods are used for non-parametric
nonlinear regression.

Activation functions in Neural Networks


It is recommended to understand Neural Networks before reading this article.


In the process of building a neural network, one of the choices you get to
make is what Activation Function to use in the hidden layer as well as at the
output layer of the network. This article discusses some of the choices.
Elements of a Neural Network
Input Layer: This layer accepts input features. It provides information from the outside world to the network; no computation is performed at this layer. Nodes here just pass the information (features) on to the hidden layer.
Hidden Layer: Nodes of this layer are not exposed to the outer world; they are part of the abstraction provided by any neural network. The hidden layer performs all sorts of computation on the features entered through the input layer and transfers the result to the output layer.
Output Layer: This layer brings the information learned by the network out to the outer world.
What is an activation function and why use them?
The activation function decides whether a neuron should be activated or not by computing the weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.
Explanation: We know that a neural network has neurons that work in correspondence with their weights, biases, and respective activation functions. In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation. Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.
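As a minimal sketch of the computation just described (the sample weights, bias, and input values below are illustrative, not from the article):

```python
import numpy as np

def neuron_output(x, w, b, activation):
    """A single neuron: weighted sum of inputs, plus bias, through an activation."""
    z = np.dot(w, x) + b     # weighted sum plus bias
    return activation(z)     # non-linear transformation of the result

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(neuron_output(np.array([0.5, -1.2]), np.array([0.8, 0.3]), 0.1, sigmoid))
```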
Why do we need non-linear activation functions?
A neural network without an activation function is essentially just a linear regression model. The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
Mathematical proof
Suppose we have a neural net like this (diagram not reproduced: two inputs i1 and i2 feed a two-neuron hidden layer with weights w1 to w4 and biases b1 and b2, followed by an output layer):

Elements of the diagram are as follows:

Hidden layer, i.e. layer 1:
z(1) = W(1)X + b(1)
a(1) = z(1)
Here,
• z(1) is the vectorized output of layer 1
• W(1) is the vectorized weights assigned to the neurons of the hidden layer, i.e. w1, w2, w3 and w4
• X is the vectorized input features, i.e. i1 and i2
• b is the vectorized bias assigned to the neurons in the hidden layer, i.e. b1 and b2
• a(1) is the vectorized output of layer 1; with no activation applied, it is simply a linear (identity) function of z(1)
(Note: We are not considering an activation function here)

Layer 2 i.e. output layer :-


Note : Input for layer 2 is output from layer 1
z(2) = W(2)a(1) + b(2)
a(2) = z(2)
Calculation at Output layer
z(2) = (W(2) * [W(1)X + b(1)]) + b(2)
z(2) = [W(2) * W(1)] * X + [W(2)*b(1) + b(2)]
Let,
[W(2) * W(1)] = W
[W(2)*b(1) + b(2)] = b
Final output : z(2) = W*X + b
which is again a linear function
The output is thus again a linear function even after applying a hidden layer. Hence we can conclude that no matter how many hidden layers we attach to the neural net, all layers will behave the same way, because the composition of two linear functions is itself a linear function. A neuron cannot learn with just a linear function attached to it; a non-linear activation function lets it learn according to the error gradient. Hence we need an activation function.
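A quick NumPy check of this collapse, using arbitrary random weights (a sketch, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # hidden layer, no activation
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # output layer
X = rng.normal(size=2)

z2 = W2 @ (W1 @ X + b1) + b2       # two stacked linear layers...
W, b = W2 @ W1, W2 @ b1 + b2       # ...equal one layer with W = W2 W1, b = W2 b1 + b2
print(np.allclose(z2, W @ X + b))  # True
```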
Variants of Activation Functions
Linear Function
• Equation: A linear function has the equation of a straight line, i.e. y = x.
• No matter how many layers we have, if all of them are linear in nature, the final activation of the last layer is nothing but a linear function of the input of the first layer.
• Range: -inf to +inf
• Uses: The linear activation function is used in just one place, i.e. the output layer.
• Issues: The derivative of a linear function is a constant, so it no longer depends on the input x; the gradient therefore carries no information about the input, and gradient-based learning gains nothing from stacking such layers.
For example: calculating the price of a house is a regression problem. A house price may take any large or small value, so we can apply a linear activation at the output layer. Even in this case, the neural net must have a non-linear activation function in its hidden layers.
Sigmoid Function
• It is a function which is plotted as an 'S'-shaped graph.
• Equation: A = 1/(1 + e^(-x))
• Nature: Non-linear. Notice that for x values between -2 and 2, the curve is very steep; small changes in x bring about large changes in the value of y.
• Value Range: 0 to 1
• Uses: Usually used in the output layer of a binary classifier, where the result is either 0 or 1. Since the value of the sigmoid function lies between 0 and 1 only, the result can easily be predicted as 1 if the value is greater than 0.5 and as 0 otherwise.
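A minimal sketch of the sigmoid and the 0.5 thresholding rule (the sample values are illustrative):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(x)
print(probs)                        # values in (0, 1), steepest near x = 0
print((probs > 0.5).astype(int))    # binary prediction: 1 if above 0.5, else 0
```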
Tanh Function
• The activation that almost always works better than the sigmoid function is the tanh function, also known as the hyperbolic tangent function. It is actually a mathematically shifted and scaled version of the sigmoid function; the two are similar and can be derived from each other.
• Equation: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
• Value Range: -1 to +1
• Nature: non-linear
• Uses: Usually used in hidden layers of a neural network. Since its values lie between -1 and 1, the mean of the hidden-layer activations comes out to be 0 or very close to it, which helps center the data by bringing the mean close to 0. This makes learning for the next layer much easier.
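A short sketch of tanh, including a check of its relationship to the sigmoid (tanh(x) = 2·sigmoid(2x) - 1, a standard identity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(np.tanh(x))                                       # zero-centred outputs in (-1, 1)
# tanh is a shifted, scaled sigmoid: tanh(x) = 2*sigmoid(2x) - 1
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))  # True
```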
ReLU Function
• It stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network.
• Equation: A(x) = max(0, x). It gives an output of x if x is positive and 0 otherwise.
• Value Range: [0, inf)
• Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function.
• Uses: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. Only a few neurons are activated at a time, which makes the network sparse and therefore efficient and easy to compute.
In simple words, ReLU learns much faster than the sigmoid and tanh functions.
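A one-line implementation of A(x) = max(0, x), with illustrative inputs showing how negatives are zeroed out:

```python
import numpy as np

def relu(x):
    """ReLU activation: max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

x = np.array([-3.0, -0.5, 0.0, 2.0, 7.0])
print(relu(x))                 # [0. 0. 0. 2. 7.] -- negative inputs become 0
print(np.mean(relu(x) > 0))    # fraction of "active" neurons: the sparsity effect
```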

Is ReLU linear or non-linear?

A rectified linear unit (ReLU) is an activation function that introduces the property of non-linearity to a deep learning model and mitigates the vanishing gradient problem. It returns the positive part of its argument, so it is linear for positive inputs and zero for negative ones, which makes the function as a whole non-linear. It is one of the most popular activation functions in deep learning.

Softmax Function

The softmax function is also a type of sigmoid function, but it is handy when we are trying to handle multi-class classification problems.
• Nature: non-linear
• Uses: Usually used when handling multiple classes. The softmax function is commonly found in the output layer of image classification problems. It squeezes the output for each class to between 0 and 1 and also divides by the sum of the outputs (a sketch follows this list).
• Output: The softmax function is ideally used in the output layer of the classifier, where we are actually trying to obtain the probabilities that define the class of each input.
• The basic rule of thumb is: if you really don't know what activation function to use, simply use ReLU, as it is a general-purpose activation function for hidden layers and is used in most cases these days.
• If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer.
• If your output is for multi-class classification, then softmax is very useful for predicting the probability of each class.
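A minimal softmax sketch (the logits are illustrative; subtracting the maximum is a common numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(z):
    """Softmax: exponentiate, then normalise so the outputs sum to 1."""
    e = np.exp(z - np.max(z))    # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw scores for three classes
probs = softmax(logits)
print(probs, probs.sum())            # a probability distribution summing to 1
print(np.argmax(probs))              # predicted class index: 0
```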

A stochastic (random) function X(t) is a many-valued numerical function of an independent argument t, whose value for any fixed value t ∈ T (where T is the domain of the argument) is a random variable, called a cut set.

A stochastic quantity is one that is well described by a random probability distribution.

In stochastic neural networks, instead of assigning deterministic values to each neuron, the algorithm assigns probabilities to the neurons: a neuron fires only if its activation passes the threshold value. Such a network is built by introducing random variation into the network and by giving it stochastic weights.
Stochastic processes are a key tool for real-time mathematical modelling of systems with a continuously and randomly varying nature. They have a wide range of applications, from image processing and neuroscience to bioinformatics, financial management, and statistics.

A variable or process is stochastic if there is uncertainty or randomness involved in the outcomes. Stochastic is a synonym for random and probabilistic, although it is distinct from non-deterministic. Many machine learning algorithms are stochastic because they explicitly use randomness during optimization or learning.
Stochastic modeling forecasts the probability of various outcomes under different conditions,
using random variables. Stochastic modeling presents data and predicts outcomes that
account for certain levels of unpredictability or randomness.
Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an
objective function with suitable smoothness properties (e.g. differentiable or
subdifferentiable).
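As a hedged illustration of SGD (the least-squares objective, synthetic data, learning rate, and epoch count below are all illustrative choices, not from the article):

```python
import numpy as np

# Minimal SGD sketch: fit y = w.x + b by least squares on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -3.0]) + 0.5 + 0.01 * rng.normal(size=100)

w, b, lr = np.zeros(2), 0.0, 0.1
for epoch in range(50):
    for i in rng.permutation(len(X)):   # visit examples in random order
        err = (X[i] @ w + b) - y[i]     # prediction error on one sample
        w -= lr * err * X[i]            # gradient step for the weights
        b -= lr * err                   # gradient step for the bias
print(w, b)   # should approach [2, -3] and 0.5
```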

Quora

The main difference is that, with a fixed input, the output of a stochastic neural net is likely to be different (stochastic, or random to a certain extent) across multiple evaluations, in contrast to deterministic neural networks, where for a fixed input the output is also unique (deterministic).

Such neural networks are useful if you want to model the behavior of partially random systems. Imagine that you set up an experiment where you show a picture and ask a human to name one thing they see in the image (and there are many different things in the image). You can anticipate the set of answers a human will give for a certain image, but you cannot say precisely which specific answer will be given. Therefore, if you would like to model such human behavior, you would prefer a stochastic neural network to do so.

Are neural networks stochastic or deterministic?

After training has been completed, then the internal workings of a neural network are
deterministic, not stochastic.

A neural network is essentially a mathematical structure that transforms one data object
applied to the input end into another data object which appears at the output end.

If we are thinking about determinism, then a neural network is no different from this completely made-up function: y(x) = [3x^3 - 1.8x^2 + sin(3x/4)] / [6.5 exp(4x + 3)].

y(x) will always return the same result when x = 0.3447, and that result will be a real number.

If you wrote out the equation for a neural network like this, it would be extremely complex, but it would produce deterministic results just the same: you would only need to apply a given data structure to the input end once. You do not have to apply the same input again and again and analyse the distribution of results; you only get one result.
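A toy sketch of this point, with frozen random weights standing in for trained parameters: repeated evaluation of the same input always yields the same output.

```python
import numpy as np

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)   # frozen "trained" parameters
W2, b2 = rng.normal(size=(1, 8)), rng.normal(size=1)

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)   # hidden layer with ReLU
    return W2 @ h + b2                 # linear output layer

x = np.array([0.3447, -1.0, 2.0])
outputs = [forward(x) for _ in range(5)]
print(all(np.allclose(outputs[0], o) for o in outputs))  # True: one input, one result
```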

However, the training algorithm is not deterministic, which means that the parameter values you get after one training run are very likely to differ from those you will get after another training run, even when the training data is the same. This is actually why the training of a complex neural network is a bit of an art and can often involve a fair bit of trial and error, as some training journeys either lead to poor results or fail to converge.

But let's return to how the network works after it has been trained.
The picture becomes more subtle when we have deliberately designed y(x) to return a statistical parameter. So we might want y(x) to represent the confidence level that a given input (e.g. an image) contains a 'stop' sign, and so the value of y(x) might in this case range from 0 to 1.

So a more complete answer to your question would be that after it has been trained, a neural network is intrinsically deterministic, but we might interpret the output it generates stochastically.
Deterministic update: If the activation value exceeds the threshold, the node/neuron fires.

Stochastic update: If the activation value exceeds the threshold, there is a probability associated with firing. That is, there is some probability of the neuron not firing even when its activation exceeds the threshold.

If that probability is one, then the update is deterministic.
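A small sketch contrasting the two update rules (the fixed firing probability p_fire is an illustrative choice; stochastic units often derive it from the activation instead, e.g. via a sigmoid):

```python
import numpy as np

rng = np.random.default_rng(0)

def deterministic_update(activation, threshold=0.0):
    """Fires with certainty whenever the activation exceeds the threshold."""
    return activation > threshold

def stochastic_update(activation, threshold=0.0, p_fire=0.8):
    """Above the threshold, the neuron fires only with probability p_fire."""
    return activation > threshold and rng.random() < p_fire

a = 1.5  # an activation above the threshold
print(deterministic_update(a))                    # always True
print([stochastic_update(a) for _ in range(5)])   # mostly True, occasionally False
```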

What is the capacity of a perceptron?


From an information-theoretic point of view, a single perceptron with K inputs has a capacity of 2K bits of information.

What is the capacity of a neural network?


Neural networks are defined at various levels of abstraction, modelling different aspects of the system. The capacity of a network is therefore nothing but the number of fundamental memories, that is, the number of patterns that can be stored in and recalled from the network.

What are the limitations of a simple perceptron?

The following are the limitations of a perceptron model:
• The output of a perceptron can only be a binary number (0 or 1), due to the hard-edge (step) transfer function.
• It can only be used to classify linearly separable sets of input vectors. If the input vectors are not linearly separable, the perceptron cannot classify them correctly.

What is the perceptron convergence procedure?

Perceptron Convergence Theorem: For any finite set of linearly separable labeled examples, the Perceptron Learning Algorithm will halt after a finite number of iterations. In other words, after a finite number of iterations, the algorithm yields a vector w that classifies all the examples perfectly.
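A compact sketch of the perceptron learning rule on a linearly separable toy problem (labels in {-1, +1}; the data and epoch cap are illustrative):

```python
import numpy as np

def perceptron_train(X, y, max_epochs=100):
    """Perceptron learning rule; halts once every example is classified correctly."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified example
                w += yi * xi             # nudge the boundary toward it
                b += yi
                errors += 1
        if errors == 0:                  # converged: all examples correct
            return w, b
    return w, b

# AND-like, linearly separable data: converges in a few epochs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
print(perceptron_train(X, y))
```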
