
ACTIVATION FUNCTIONS

In biologically inspired neural networks, the activation function is usually an abstraction representing the rate of action potential firing in the cell.

 In its simplest form, this function is binary, that is, the neuron is either firing or not.

 The activation function defines the output of a neuron in terms of the induced local field.

Activation functions can broadly be divided into two types:

Linear Activation Function

Non-linear Activation Functions

LINEAR OR IDENTITY ACTIVATION FUNCTION

As the graph shows, the function is simply a straight line.

Therefore, the output of the function is not confined to any range.

[Figure: graph of the identity function, a straight line through the origin.]

Equation: f(x) = x

Range: (-infinity, infinity)

It does not help with the complexity or varied parameters of the usual data that is fed to neural networks; a stack of layers with linear activations still collapses to a single linear mapping.
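As a minimal sketch (my own illustration, not from the original slides, and assuming NumPy is available), the code below defines the identity activation and shows why it adds no modelling power: two stacked layers with linear activations collapse into a single linear layer.

```python
import numpy as np

def linear(x):
    # Identity (linear) activation: f(x) = x, range (-infinity, infinity).
    return x

# Illustrative weights for two layers (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(2, 3))
x = rng.normal(size=4)

two_layers = W2 @ linear(W1 @ x)  # layer 1, linear activation, then layer 2
one_layer = (W2 @ W1) @ x         # the equivalent single linear layer
print(np.allclose(two_layers, one_layer))  # True: no extra expressive power
```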

NON-LINEAR ACTIVATION FUNCTION

Non-linear activation functions are the most widely used activation functions.

Nonlinearity makes the graph curve rather than remain a straight line (see next slide).

A non-linear activation function makes it easier for the model to generalize or adapt to a variety of data and to differentiate between outputs.

THE MAIN TERMINOLOGY NEEDED TO UNDERSTAND NON-LINEAR FUNCTIONS:

Derivative or differential: the change in the y-axis with respect to the change in the x-axis. It is also known as the slope.

Monotonic function: a function which is either entirely non-increasing or entirely non-decreasing (a minimal numerical check of both ideas is sketched after this list).

Non-linear activation functions are mainly divided on the basis of their ranges or the shapes of their curves.
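As an illustrative sketch of these two terms (my own example, assuming NumPy; the helpers slope and is_monotonic are hypothetical names), the code below estimates a derivative numerically and checks whether a function is monotonic over a sampled interval.

```python
import numpy as np

def slope(f, x, h=1e-6):
    # Numerical derivative (slope): change in y with respect to change in x.
    return (f(x + h) - f(x - h)) / (2 * h)

def is_monotonic(f, xs):
    # A function is monotonic if it is entirely non-increasing or non-decreasing.
    diffs = np.diff(f(xs))
    return bool(np.all(diffs >= 0) or np.all(diffs <= 0))

xs = np.linspace(-5, 5, 1001)
print(slope(np.tanh, 0.0))        # ~1.0: the slope of tanh at the origin
print(is_monotonic(np.tanh, xs))  # True: tanh never decreases
```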

 SIGMOID OR LOGISTIC ACTIVATION FUNCTION

The sigmoid function curve looks like an S-shape.

[Figure: S-shaped sigmoid curve rising from 0 to 1; sigma(z) = 1 / (1 + e^(-z)).]

The main reason we use the sigmoid function is that its output lies between 0 and 1.
Therefore, it is especially used for models where we have to predict a probability as the output.

Since any probability lies only between 0 and 1, the sigmoid is the right choice.

The function is differentiable. That means we can find the slope of the sigmoid curve at any point.

The function is monotonic, but its derivative is not.

The logistic sigmoid function can cause a neural network to get stuck during training, because its gradient becomes very small for large positive or negative inputs (the vanishing-gradient problem).

The softmax function is a generalization of the logistic function that is used for multiclass classification.
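A minimal NumPy sketch (my own, using the standard formulas) of the sigmoid, its derivative, and the softmax mentioned above:

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid: sigma(z) = 1 / (1 + e^(-z)); output lies in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # d/dz sigma(z) = sigma(z) * (1 - sigma(z)); largest at z = 0, so not monotonic.
    s = sigmoid(z)
    return s * (1.0 - s)

def softmax(z):
    # Softmax generalizes the sigmoid to a vector of class scores.
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))             # values strictly between 0 and 1
print(sigmoid_derivative(z))  # small in the tails, peak of 0.25 at z = 0
print(softmax(z))             # non-negative values that sum to 1
```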

TANH OR HYPERBOLIC TANGENT ACTIVATION FUNCTION

tanh is similar to the logistic sigmoid, but often works better in practice.

The range of the tanh function is (-1, 1).

tanh is also sigmoidal (S-shaped).

[Figure: an S-shaped tanh curve ranging from -1 to 1, plotted alongside the sigmoid curve rising from 0 through 0.5 to 1; the curves are labelled tanh and sigmoid.]

The advantage is that negative inputs are mapped strongly negative and zero inputs are mapped near zero on the tanh graph.

 The function is differentiable.

 The function is monotonic while its derivative is not monotonic.

 The tanh function is mainly used for classification between two classes.

 Both tanh and logistic sigmoid activation functions are used in feed-forward nets.
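A short sketch of tanh (my own, assuming NumPy), including the standard identity relating it to the sigmoid, tanh(z) = 2 * sigmoid(2z) - 1:

```python
import numpy as np

def tanh(z):
    # Hyperbolic tangent: S-shaped, output in (-1, 1), zero at z = 0.
    return np.tanh(z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(tanh(z))  # strongly negative inputs map near -1, zero maps to 0
# tanh is a scaled and shifted sigmoid: tanh(z) = 2 * sigmoid(2z) - 1
print(np.allclose(tanh(z), 2 * sigmoid(2 * z) - 1))  # True
```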

RELU (RECTIFIED LINEAR UNIT) ACTIVATION FUNCTION

ReLU is currently the most widely used activation function.

It is used in almost all convolutional neural networks and deep learning models.

[Figure: ReLU curve, flat at zero for negative inputs and rising linearly for positive inputs; R(z) = max(0, z).]

As the graph shows, the ReLU is half rectified (from the bottom).

f(z) is zero when z is less than zero, and f(z) is equal to z when z is greater than or equal to zero.

Range: [0, infinity)

 The function and its derivative both are monotonic.

But the issue is that all negative values become zero immediately, which reduces the model's ability to fit or train on the data properly.
That means any negative input given to the ReLU activation function is turned to zero immediately, so negative values are not mapped appropriately.
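A minimal NumPy sketch of ReLU (my own), showing how every negative input is clamped to zero:

```python
import numpy as np

def relu(z):
    # ReLU: R(z) = max(0, z); range [0, infinity).
    return np.maximum(0.0, z)

z = np.array([-3.0, -0.1, 0.0, 0.1, 3.0])
print(relu(z))  # negative inputs are all zeroed out; positive inputs pass through
```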

LEAKY RELU

It is an attempt to solve the dying ReLU problem.

[Figure: the ReLU line f(y) = y for positive inputs, with a slightly sloped line f(y) = ay added to the left of the y-axis for negative inputs.]

 The leak helps to increase the range of the ReLU function.

 Usually, the value of a is 0.01 or so.

When a is not 0.01, it is called a Randomized ReLU.

Therefore, the range of the Leaky ReLU is (-infinity, infinity).

 Both Leaky and Randomized ReLU functions are monotonic in nature.

Their derivatives are also monotonic in nature.
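A minimal NumPy sketch of the Leaky ReLU (my own), with the leak coefficient a defaulting to 0.01 as noted above:

```python
import numpy as np

def leaky_relu(z, a=0.01):
    # Leaky ReLU: f(z) = z for z >= 0 and a * z for z < 0; range (-infinity, infinity).
    return np.where(z >= 0, z, a * z)

z = np.array([-3.0, -0.1, 0.0, 0.1, 3.0])
print(leaky_relu(z))         # negative inputs are scaled by 0.01 instead of zeroed
print(leaky_relu(z, a=0.2))  # a larger leak coefficient
```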
