Activation Functions in Neural Networks
Definition of activation function :- An activation function decides whether a neuron should be activated or not by calculating the weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.
Explanation :-
We know that a neural network has neurons that work together through weights, biases and their respective activation functions. In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation. Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.
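To make this concrete, here is a minimal single-neuron sketch of one back-propagation step; the tanh activation, squared-error loss, learning rate and data values are assumptions chosen only for illustration:

```python
import numpy as np

# One neuron: y_hat = tanh(w*x + b), squared-error loss L = 0.5*(y_hat - y)^2
x, y = 0.5, 1.0            # a single training example (arbitrary values)
w, b, lr = 0.1, 0.0, 0.1   # initial weight, bias and learning rate (arbitrary)

z = w * x + b
y_hat = np.tanh(z)

# Back-propagation: the activation's derivative (1 - tanh(z)^2) scales the error
dz = (y_hat - y) * (1.0 - np.tanh(z) ** 2)
w -= lr * dz * x           # dL/dw = dz * x
b -= lr * dz               # dL/db = dz
print(w, b)
```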
Mathematical proof :-
Consider a network with one hidden layer and no non-linear activation (i.e. the identity activation a(z) = z) :

Hidden layer (layer 1) : z(1) = W(1)*X + b(1)
Output layer (layer 2) : z(2) = W(2)*z(1) + b(2)

Substituting the first equation into the second :

z(2) = W(2)*[W(1)*X + b(1)] + b(2)
     = [W(2)*W(1)]*X + [W(2)*b(1) + b(2)]

Let,
[W(2)*W(1)] = W
[W(2)*b(1) + b(2)] = b

Final output : z(2) = W*X + b

which is again a linear function of the input X.
This observation shows that the output is still a linear function even after adding a hidden layer. Hence we can conclude that no matter how many hidden layers we attach to the neural net, all the layers together behave like a single layer, because the composition of two linear functions is itself a linear function. A neuron cannot learn anything more with just a linear function attached to it; a non-linear activation function will let it learn according to the difference w.r.t. the error.
Hence we need non-linear activation functions.
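As a quick check of the result above, here is a minimal numpy sketch (the layer sizes and random weights are arbitrary assumptions) showing that two stacked linear layers collapse into a single linear map:

```python
import numpy as np

# Arbitrary weights and biases for a 3 -> 4 -> 2 network with identity activations
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# Two stacked linear layers
z1 = W1 @ x + b1
z2 = W2 @ z1 + b2

# One collapsed linear layer: W = W2*W1, b = W2*b1 + b2
W = W2 @ W1
b = W2 @ b1 + b2

print(np.allclose(z2, W @ x + b))  # True: the two-layer net is just one linear map
```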
1). Linear Function :-
Equation :- A linear function has an equation similar to that of a straight line, i.e. y = a*x.
No matter how many layers we have, if all of them are linear in nature, the final activation of the last layer is nothing but a linear function of the input to the first layer.
Value Range :- -inf to +inf
Uses :- The linear activation function is used at just one place, i.e. the output layer.
Issues :- If we differentiate the linear function, the result no longer depends on the input x: the derivative is a constant. So the gradient is the same for every input, and it won't introduce any new behaviour to our algorithm.
For example :- Predicting the price of a house is a regression problem. A house price may take any large or small value, so we can apply a linear activation at the output layer. Even in this case, the neural net must have non-linear activation functions in its hidden layers.
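A minimal sketch of the linear (identity) activation and its constant derivative; the helper names and sample values are illustrative assumptions:

```python
import numpy as np

def linear(x, a=1.0):
    """Linear activation y = a*x (the identity function when a = 1)."""
    return a * x

def linear_derivative(x, a=1.0):
    """The derivative of y = a*x is the constant a, independent of x."""
    return np.full_like(x, a)

x = np.array([-2.0, 0.0, 3.0])
print(linear(x))             # [-2.  0.  3.]
print(linear_derivative(x))  # [1. 1. 1.] -> the gradient ignores the input
```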
3). Tanh Function :- The activation function that almost always works better than the sigmoid function is the tanh function, also known as the hyperbolic tangent function. It is actually a mathematically shifted version of the sigmoid function; the two are similar and can be derived from each other.
Equation :- tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)) = 2*sigmoid(2x) - 1
Value Range :- -1 to +1
Nature :- non-linear
Uses :- Usually used in the hidden layers of a neural network. Since its values lie between -1 and 1, the mean of a hidden layer's activations comes out to be 0 or very close to it, which helps in centering the data by bringing the mean close to 0. This makes learning for the next layer much easier.
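A minimal numpy sketch of the tanh activation; the sample inputs are arbitrary:

```python
import numpy as np

def tanh(x):
    """Hyperbolic tangent: squashes inputs into the open interval (-1, 1)."""
    return np.tanh(x)

x = np.linspace(-4, 4, 9)
y = tanh(x)
print(y.min(), y.max())  # values stay within (-1, 1)
print(y.mean())          # roughly zero for inputs symmetric around 0
```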
4). RELU :- Stands for Rectified Linear Unit: A(x) = max(0, x). It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network.
In simple words, RELU learns much faster than the sigmoid and tanh functions.
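A minimal sketch of RELU and its derivative; the sample values are arbitrary:

```python
import numpy as np

def relu(x):
    """ReLU: A(x) = max(0, x) -- passes positive inputs, zeroes out negatives."""
    return np.maximum(0, x)

def relu_derivative(x):
    """Gradient is 1 for positive inputs and 0 otherwise (undefined at exactly 0)."""
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))             # [0.  0.  0.  2.]
print(relu_derivative(x))  # [0. 0. 0. 1.]
```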
5). Softmax Function :- The softmax function is also a type of sigmoid function, but it is handy when we are trying to handle classification problems.
Nature :- non-linear
Uses :- Usually used when handling multiple classes. The softmax function squashes the output for each class to a value between 0 and 1 and divides each output by the sum of all the outputs.
Output :- The softmax function is ideally used in the output layer of the classifier, where we are actually trying to obtain the probabilities that define the class of each input.
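A minimal, numerically stable softmax sketch; the logit values are arbitrary:

```python
import numpy as np

def softmax(logits):
    """Exponentiate (shifted for numerical stability) and normalize to sum to 1."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # approx. [0.659 0.242 0.099] -- per-class probabilities
print(probs.sum())  # 1.0
```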
CHOOSING THE RIGHT ACTIVATION FUNCTION
The basic rule of thumb is that if you really don't know which activation function to use, simply use RELU, as it is a general-purpose activation function and is used in most cases these days.
If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer.
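As an illustration of these rules of thumb, here is a minimal sketch assuming TensorFlow/Keras is available; the layer sizes and input dimension are arbitrary choices made for the example:

```python
import tensorflow as tf

# RELU in the hidden layers, sigmoid at the output for binary classification
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),                     # 20 input features (arbitrary)
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of the positive class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```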
Foot Note :-
The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
Reference :
Understanding Activation Functions in Neural Networks