Building a Tanh Activation Function
Table of Contents
1. Introduction to Tanh Activation Function
a. What are Activation Functions?
b. Why are Activation Functions Used?
c. Types of Activation Functions
d. Choosing the Right Activation Function
2. What is Tanh Activation Function?
3. Comparison of ReLU, Leaky ReLU, Sigmoid and Tanh Activation Functions
4. The Structure of a Tanh Activation Function
5. Implementing the Tanh Activation Function from Scratch in Python
a. Step 1: Import Necessary Libraries
b. Step 2: Define the Tanh Function
c. Step 3: Define the Tanh Function Derivative
d. Step 4: Plot the Tanh Function and Its Derivative
6. Conclusion
1. Introduction to Tanh Activation Function
Activation functions are a crucial component of artificial neural networks, playing a key role in enabling the
neural networks to learn complex patterns and make intelligent decisions. The Tanh (Hyperbolic Tangent)
activation function is commonly used in neural networks to introduce non-linearity into the model. It maps
input values to a range between -1 and 1.
This post will guide you through implementing the Tanh activation function from scratch in Python, exploring
the fundamental concepts and steps involved, and providing sample code with detailed explanations and
visualizations.
In simple terms, an activation function in a neural network decides whether a neuron should be "activated" or
not, based on the input it receives. Imagine a neuron as a light bulb – the activation function determines if the
bulb should light up or stay off.
1. Introducing Non-linearity: Without activation functions, neural networks would only be able to perform
linear transformations. Real-world data often exhibits non-linear relationships, meaning a straight line can't
accurately represent the patterns. Activation functions introduce non-linearity, allowing the network to model
and learn these complex relationships (a short sketch after this list makes this concrete).
2. Decision Boundary Creation: Activation functions help neural networks create decision boundaries. For
example, in image classification, an activation function can help the network decide whether a picture is of a cat
or a dog by drawing a boundary between the two categories based on learned features.
3. Controlling Neuron Output: Activation functions control the output of a neuron, keeping it within a desired
range. This is important for stability and efficiency during training.
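To make the first point concrete, here is a minimal NumPy sketch (the matrices are arbitrary): two stacked
linear layers with no activation function in between collapse into a single linear layer, so depth alone adds
no expressive power.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))  # first layer: 4 inputs -> 3 units
W2 = rng.standard_normal((2, 3))  # second layer: 3 units -> 2 outputs
x = rng.standard_normal(4)        # an arbitrary input vector

deep = W2 @ (W1 @ x)      # two linear layers applied in sequence
shallow = (W2 @ W1) @ x   # one linear layer with the combined weights
print(np.allclose(deep, shallow))  # True: the two are identical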
There are various types of activation functions, each with its own characteristics and applications, including
the following (minimal code sketches of a few of them appear after the list):
* Sigmoid: Outputs a value between 0 and 1, historically used for binary classification.
* ReLU (Rectified Linear Unit): Outputs the input directly if positive, otherwise 0. Very popular due to its
computational efficiency.
* Tanh (Hyperbolic Tangent): Similar to Sigmoid but outputs values between -1 and 1.
* Softmax: Used in the output layer for multi-class classification, providing probabilities for each class.
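For concreteness, here are minimal NumPy sketches of Sigmoid, ReLU, and Softmax (illustrative, not
optimized; Tanh gets its own implementation later in this post):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))   # squashes inputs to (0, 1)

def relu(x):
    return np.maximum(0, x)       # passes positives through, zeroes out negatives

def softmax(x):
    e = np.exp(x - np.max(x))     # subtract the max for numerical stability
    return e / e.sum()            # probabilities that sum to 1

print(softmax(np.array([1.0, 2.0, 3.0])))  # approximately [0.09, 0.24, 0.67]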
The choice of activation function depends on the specific task, network architecture, and other factors.
Experimentation and research are often needed to find the most suitable one for a particular problem.
In essence, activation functions are the "brain" behind a neural network's decision-making process, enabling it
to learn intricate patterns and make accurate predictions.
2. What is Tanh Activation Function?
The Tanh activation function, short for "hyperbolic tangent," is a popular choice in neural networks. It takes an
input (representing the weighted sum of inputs to a neuron) and squashes it to an output value between -1 and 1.
It is defined as:

tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))

where 'x' is the input value, and 'exp(x)' is the exponential function (e to the power of x).
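As a quick sanity check, the definition above matches NumPy's built-in implementation (a minimal sketch
assuming NumPy is installed):

import numpy as np

x = 1.0
manual = (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))
print(manual)      # 0.7615941559557649
print(np.tanh(x))  # 0.7615941559557649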
Key Characteristics:
Output Range: The output of the Tanh function ranges from -1 to 1, making it zero-centered (unlike Sigmoid,
which ranges from 0 to 1).
Smoothness and Differentiability: Like Sigmoid, Tanh is smooth and differentiable, which is essential for
gradient-based optimization algorithms used to train neural networks.
Non-Linearity: Tanh introduces non-linearity to the model, allowing neural networks to learn complex
patterns and relationships.
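Because Tanh is smooth and differentiable, its derivative is what backpropagation actually uses, and it has a
convenient closed form. Writing tanh(x) = sinh(x) / cosh(x) and applying the quotient rule:

tanh'(x) = (cosh(x)^2 - sinh(x)^2) / cosh(x)^2 = 1 - tanh(x)^2

So the gradient can be computed directly from the function's own output, a property we exploit in the
implementation below.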
Advantages of Tanh:
Zero-Centered Output: The zero-centered property of Tanh often helps in faster convergence during training
compared to Sigmoid. Because activations can be positive or negative, the gradients flowing back through the
network are less likely to be biased in a single direction.
Handles Negative Inputs Better: Unlike ReLU, which "dies" for negative inputs, Tanh gracefully handles
them, providing non-zero gradients.
Disadvantages:
Vanishing Gradients: Similar to Sigmoid, Tanh can also suffer from the vanishing gradient problem for very
large or very small input values. This can slow down training, especially in deep networks.
Computational Cost: Tanh is computationally more expensive than ReLU and its variants due to the use of
exponential functions in its calculation.
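The vanishing-gradient problem is easy to observe numerically using the derivative identity derived above
(a minimal sketch):

import numpy as np

for x in [0.0, 2.0, 5.0, 10.0]:
    grad = 1 - np.tanh(x) ** 2
    print(f"x = {x:5.1f}  ->  gradient = {grad:.2e}")

# At x = 0 the gradient is 1, but by x = 5 it has shrunk to roughly 1.8e-04,
# so weight updates driven by saturated neurons become vanishingly small.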
Common Use Cases:
Hidden Layers: Tanh is frequently used in the hidden layers of neural networks, especially in recurrent neural
networks (RNNs).
Tasks Requiring Zero-Centered Outputs: Its zero-centered output makes it suitable for tasks where having
activations centered around zero is beneficial.
Tanh is a versatile activation function that addresses some of the limitations of Sigmoid while introducing its
own trade-offs. Its zero-centered output and ability to handle negative inputs make it a valuable tool in the deep
learning toolbox.
3. Comparison of ReLU, Leaky ReLU, Sigmoid and Tanh Activation Functions
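In brief: ReLU is cheap to compute and does not saturate for positive inputs, but outputs zero for all negative
inputs; Leaky ReLU keeps a small slope (commonly 0.01) on the negative side to avoid "dead" neurons; Sigmoid
squashes inputs to (0, 1) and saturates at both ends; Tanh squashes inputs to (-1, 1), is zero-centered, and also
saturates. A minimal sketch, assuming NumPy and Matplotlib are available, that plots all four side by side:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
activations = {
    "ReLU": np.maximum(0, x),
    "Leaky ReLU": np.where(x > 0, x, 0.01 * x),  # 0.01 is the usual leak factor
    "Sigmoid": 1 / (1 + np.exp(-x)),
    "Tanh": np.tanh(x),
}

for name, y in activations.items():
    plt.plot(x, y, label=name)
plt.title("Comparison of Activation Functions")
plt.xlabel("x")
plt.legend()
plt.grid(True)
plt.show()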
4. The Structure of a Tanh Activation Function
This structure lays out the steps and sub-steps of the implementation, with appropriate labels and connections.
Each step corresponds to a function or a key part of the process described in the implementation that follows.
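At a high level, the implementation proceeds as follows:
1. Import the necessary libraries (NumPy for computation, Matplotlib for plotting).
2. Define the tanh function, which squashes inputs to the range (-1, 1).
3. Define the tanh derivative, tanh'(x) = 1 - tanh(x)^2, needed for gradient-based training.
4. Plot the function and its derivative to visualize their behavior.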
5. Implementing the Tanh Activation Function from Scratch in Python
Let's implement a simple Tanh activation function in Python.

Step 1: Import Necessary Libraries
We need NumPy for the numerical computations and Matplotlib for the visualizations.

import numpy as np
import matplotlib.pyplot as plt

Step 2: Define the Tanh Function

def tanh(x):
    """
    Compute the hyperbolic tangent of x.

    Parameters:
    x (array-like): Input values.

    Returns:
    array-like: Tanh of the input values.
    """
    return np.tanh(x)
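A quick usage example (output values are approximate):

print(tanh(np.array([-2.0, 0.0, 2.0])))  # [-0.964  0.  0.964]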
Step 3: Define the Tanh Function Derivative

def tanh_derivative(x):
    """
    Compute the derivative of the hyperbolic tangent of x.

    Parameters:
    x (array-like): Input values.

    Returns:
    array-like: Derivative of the tanh of the input values.
    """
    return 1 - np.tanh(x) ** 2
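To gain confidence in the closed-form derivative, we can compare it against a numerical finite-difference
approximation (a minimal sketch; the step size h is arbitrary):

h = 1e-6
x = np.array([-1.0, 0.0, 1.0])
numerical = (tanh(x + h) - tanh(x - h)) / (2 * h)  # central difference
print(np.allclose(numerical, tanh_derivative(x)))  # True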
Step 4: Plot the Tanh Function and Its Derivative
To understand the behavior of the Tanh function and its derivative, let's plot them over a range of input values.

x = np.linspace(-5, 5, 200)

plt.plot(x, tanh(x), label='tanh(x)')
plt.plot(x, tanh_derivative(x), label="tanh'(x)")
plt.title('Tanh Function and Its Derivative')
plt.xlabel('x')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Notice in the resulting plot that tanh saturates at -1 and 1, and that its derivative peaks at 1 when x = 0 and
decays toward zero as |x| grows, which is the vanishing-gradient behavior discussed earlier.
6. Conclusion
In this post, we have implemented the Tanh activation function and its derivative from scratch in Python. We
also visualized the function and its derivative to understand their behaviors. This implementation provides a
fundamental understanding of how the Tanh activation function works and can be used as a building block in
developing neural networks.
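For example, the two functions above are already enough to compute the forward pass of a single dense layer
(a minimal sketch with arbitrary weights):

rng = np.random.default_rng(42)
W = rng.standard_normal((3, 4))  # weight matrix: 4 inputs -> 3 neurons
b = np.zeros(3)                  # bias vector
x = rng.standard_normal(4)       # a single input vector

a = tanh(W @ x + b)              # layer activations, each in (-1, 1)
print(a)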
Understanding and implementing activation functions like the Tanh from scratch helps in building a strong
foundation in machine learning and deep learning.