9 DL_ANN_ActivationFunctions

Activation functions determine the output of the nodes in a neural network, converting each node's linear weighted input into a linear or non-linear output signal. They should be monotonic, differentiable, and quickly converging; common types include linear functions and non-linear functions such as Sigmoid, Tanh, and ReLU. Each function has distinct properties that affect training speed and gradient behavior, and Softmax is used for multi-class classification.


Topic: Introduction To Activation Functions

HariBabu KVN
Activation Functions
• Activation Functions are applied over the linear weighted summation of the incoming information to a node (see the sketch below).
• Convert the linear input signal from the perceptron into a linear or non-linear output signal.
• Decide whether to activate a node or not.
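As a concrete illustration, a single node first forms the weighted sum of its inputs and then passes it through an activation function. This is a minimal sketch; the input values, weights, bias, and the choice of sigmoid are illustrative assumptions, not taken from the slides.

import numpy as np

def sigmoid(z):
    # Squashes the weighted sum into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs, weights, and bias for a single node
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
b = 0.2

z = np.dot(w, x) + b   # linear weighted summation of the incoming information
a = sigmoid(z)         # the activation function decides the node's output
print(z, a)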
Activation Functions
• Must be monotonic, differentiable, and quickly converging.
• Types of Activation Functions:
  – Linear
  – Non-Linear
Linear Activation Functions

f(x) = ax + b
df(x)/dx = a

• Observations:
  – Constant gradient
  – Gradient does not depend on the input (see the sketch below)
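A minimal sketch checking that the gradient of a linear activation is the constant a, whatever the input; the values of a and b are illustrative assumptions.

a, b = 2.0, 0.5   # illustrative slope and intercept

def linear(x):
    return a * x + b

# Numerical gradient at several inputs: it always equals a
for x0 in [-3.0, 0.0, 4.0]:
    h = 1e-6
    grad = (linear(x0 + h) - linear(x0 - h)) / (2 * h)
    print(x0, round(grad, 3))   # prints 2.0 each time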
Non-Linear Activation Functions

• Sigmoid (Logistic)
• Hyperbolic Tangent (Tanh)
• Rectified Linear Unit (ReLU)
– Leaky ReLU
– Parametric ReLU
• Exponential Linear Unit (ELU)
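For reference, here are minimal NumPy sketches of the functions listed above; the negative-slope and alpha values are common illustrative defaults, not taken from the slides.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))               # output in (0, 1)

def tanh(x):
    return np.tanh(x)                             # output in (-1, 1), zero-centered

def relu(x):
    return np.maximum(0.0, x)                     # 0 for negative inputs

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)          # small slope instead of 0

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))   # smooth for negative inputs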
Sigmoid

• Observations:
  – Output: 0 to 1
  – Outputs are not zero-centered
  – Can saturate and kill (vanish) gradients (see the sketch below)
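A minimal sketch of the saturation behavior: the sigmoid's derivative peaks at 0.25 and is nearly zero for large |x|, which is what makes gradients vanish. The probe points are illustrative.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # maximum value is 0.25, at x = 0

for x0 in [-10.0, 0.0, 10.0]:
    print(x0, sigmoid(x0), sigmoid_grad(x0))
# At x = +/-10 the gradient is about 4.5e-05, i.e. effectively vanished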
Tanh

• Observations:
  – Output: -1 to 1
  – Outputs are zero-centered
  – Can saturate and kill (vanish) gradients
  – Gradient is steeper than the Sigmoid's, resulting in faster convergence (sketch below)
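A small sketch comparing the two gradients: tanh's derivative is 1 - tanh(x)^2, which peaks at 1.0 versus 0.25 for the sigmoid; this is the steeper gradient noted above. The probe points are illustrative.

import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

print(tanh_grad(0.0), sigmoid_grad(0.0))   # 1.0 vs 0.25 at the origin
print(tanh_grad(3.0), sigmoid_grad(3.0))   # both much smaller: saturation sets in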
ReLU

• Observations:
  – Greatly increases training speed compared to Tanh and Sigmoid
  – Reduces likelihood of vanishing gradients
  – Issues: dead nodes and blowing up activations (see the sketch below)
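A minimal sketch of the ReLU gradient: it is 1 for positive inputs and exactly 0 for negative ones, so a node whose weighted input stays negative receives no gradient at all and stops learning (a dead node). The input values are illustrative.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return np.where(x > 0, 1.0, 0.0)   # zero gradient for negative inputs

z = np.array([-2.0, -0.1, 0.5, 3.0])
print(relu(z))        # negative inputs are clipped to 0
print(relu_grad(z))   # the first two entries get zero gradient -> they learn nothing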
Leaky ReLU

• Observations:
  – Fixes the dying ReLU issue (sketch below)
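A minimal sketch showing how the small negative-side slope keeps the gradient non-zero; the 0.01 slope is a common illustrative default, not specified in the slides.

import numpy as np

def leaky_relu_grad(x, slope=0.01):
    return np.where(x > 0, 1.0, slope)   # never exactly zero

z = np.array([-2.0, -0.1, 0.5, 3.0])
print(leaky_relu_grad(z))   # negative entries get 0.01 instead of 0, so updates still flow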
Parametric ReLU
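As generally defined, the Parametric ReLU keeps the Leaky ReLU shape but treats the negative-side slope as a learnable parameter rather than a fixed constant. A minimal sketch; the value of a is an illustrative assumption.

import numpy as np

def prelu(x, a):
    # a is learned during training rather than fixed in advance
    return np.where(x > 0, x, a * x)

print(prelu(np.array([-2.0, 3.0]), a=0.25))   # [-0.5  3. ]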
Multi-Class Classification

Softmax

Binary Classification
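For a vector of scores z over the classes, softmax is defined as

softmax(z)_i = exp(z_i) / sum_j exp(z_j)

Each output lies in (0, 1) and the outputs sum to 1, so they can be read as class probabilities; this is what the implementation below computes.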
Softmax Implementation

import numpy as np

def softmax(z):
    '''Return the softmax output of a vector.'''
    # Shift by the maximum for numerical stability (does not change the result)
    exp_z = np.exp(z - np.max(z))
    sum_exp = exp_z.sum()          # avoid shadowing the built-in sum
    softmax_z = np.round(exp_z / sum_exp, 3)
    return softmax_z
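A quick usage check, using the softmax function and NumPy import above; the scores are illustrative only.

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))   # [0.659 0.242 0.099], which sums to about 1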
