
Implementing Custom Layers and Activation Functions in TensorFlow

Last Updated : 06 Aug, 2025

TensorFlow is a flexible deep learning framework that supports both predefined and user-defined components. Standard layers and activation functions work well for many tasks, but some projects require more control or experimentation. In such cases, custom layers and custom activation functions can be implemented to create tailored architectures, inject domain-specific logic or explore novel research ideas. TensorFlow's tf.keras API supports this by letting developers define new behavior with ordinary Python classes and functions.

Custom Layers

  • Custom layers in TensorFlow allow developers to build new types of neural network components when standard layers like Dense or Conv2D are not sufficient.
  • By subclassing tf.keras.layers.Layer, you can define your own forward logic, create trainable weights and integrate specialized computations into your models.
  • This is particularly useful for experimenting with novel architectures, applying domain-specific operations or combining multiple functions within a reusable module.
  • Custom layers can be used in Sequential or functional models just like built-in layers, which makes them important for advanced deep learning tasks and research.
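
As a minimal sketch of the points above, the following defines a small custom layer and uses it in both a Sequential and a functional model. The FeatureScale layer, its weight name and the input size of 3 are hypothetical and chosen only for illustration.

```python
import numpy as np
import tensorflow as tf


class FeatureScale(tf.keras.layers.Layer):
    """Hypothetical example layer: multiplies each input feature by a learned scale."""

    def build(self, input_shape):
        # One trainable scale per input feature, initialized to 1.0 (identity).
        self.scale = self.add_weight(
            name="scale",
            shape=(input_shape[-1],),
            initializer="ones",
            trainable=True,
        )

    def call(self, inputs):
        return inputs * self.scale


# Sequential usage.
seq_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    FeatureScale(),
    tf.keras.layers.Dense(1),
])

# Functional usage of the same layer class.
inputs = tf.keras.Input(shape=(3,))
outputs = tf.keras.layers.Dense(1)(FeatureScale()(inputs))
func_model = tf.keras.Model(inputs, outputs)

x = np.ones((2, 3), dtype="float32")
seq_out = seq_model(x)
func_out = func_model(x)

# A freshly created FeatureScale layer leaves its input unchanged.
identity_out = FeatureScale()(x)
```

The weights are created in build rather than in the constructor so that the layer can infer its input size the first time it is called.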

Activation Functions

  • Activation functions in TensorFlow are important components of neural networks that introduce non-linearity, enabling models to learn complex patterns in data.
  • While TensorFlow provides many built-in activation functions like ReLU, Sigmoid and Tanh, it also supports custom activations for advanced use cases.
  • A custom activation function can be created using a simple Python function or by subclassing tf.keras.layers.Layer if more control is needed.
  • These functions are applied to layer outputs and play an important role in model performance, especially when designing novel architectures or experimenting in research.
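
The two approaches can be sketched side by side as follows. Both scaled_tanh and TrainableSwish are hypothetical names made up for this illustration, and the factor 1.5 and the learned beta parameter are assumptions rather than part of any built-in API.

```python
import numpy as np
import tensorflow as tf


def scaled_tanh(x):
    """Hypothetical custom activation written as a plain function."""
    return 1.5 * tf.math.tanh(x)


class TrainableSwish(tf.keras.layers.Layer):
    """Hypothetical activation layer: x * sigmoid(beta * x) with a learned beta."""

    def build(self, input_shape):
        # A single trainable parameter shared by all features.
        self.beta = self.add_weight(
            name="beta", shape=(1,), initializer="ones", trainable=True
        )

    def call(self, inputs):
        return inputs * tf.nn.sigmoid(self.beta * inputs)


x = tf.constant(np.linspace(-2.0, 2.0, 6).reshape(2, 3), dtype=tf.float32)

# A plain function can be passed directly as a layer's activation.
dense = tf.keras.layers.Dense(4, activation=scaled_tanh)
y_fn = dense(x)

# A Layer subclass is applied like any other layer and can hold weights.
act = TrainableSwish()
y_layer = act(x)
```

A plain function is enough when the activation has no state. The Layer subclass is the better fit when the activation needs trainable parameters, as beta does here.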

Implementation

Step 1: Import Necessary Libraries

  • This code imports required libraries for building a machine learning model using TensorFlow and scikit-learn. tensorflow is used for creating and training neural networks.
  • pandas and numpy handle data manipulation and numerical operations and train_test_split is used to split data into training and test sets while StandardScaler normalizes the features to improve model performance.
Python
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

Step 2: Load Dataset

  • This code loads the credit card dataset and separates the features from the target column Class.
  • It splits the data into training and test sets using an 80/20 split so the model can be evaluated on unseen data.
  • Finally it fits StandardScaler on the training set only and applies the same scaling to the test set, so no test-set statistics leak into training.
Python
df = pd.read_csv('creditcard.csv.zip')
X = df.drop('Class', axis=1)
y = df['Class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then reuse it for the test split.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
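
The split above is not stratified. When the positive class is rare, passing stratify to train_test_split keeps the class ratio the same in both splits. The sketch below shows this on synthetic data, since the real file may not be at hand. X_demo, y_demo and the 2% positive rate are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1000 rows, 5 features, 2% positive labels (assumed).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 5))
y_demo = np.zeros(1000, dtype=int)
y_demo[:20] = 1

# stratify=y_demo preserves the class proportions in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

train_rate = y_tr.mean()
test_rate = y_te.mean()
print(f"positive rate: train={train_rate:.3f}, test={test_rate:.3f}")
```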

Step 3: Define Custom Activation Function

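  • This code defines the Swish activation, swish(x) = x * sigmoid(x), as a plain Python function built from TensorFlow operations.
  • It then evaluates the function on 200 evenly spaced inputs between -10 and 10 and plots the resulting curve.
Python
import matplotlib.pyplot as plt

def swish(x):
    # Swish: the input scaled by its own sigmoid.
    return x * tf.nn.sigmoid(x)

# Evaluate on a float32 tensor and convert back to NumPy for plotting.
x_vals = np.linspace(-10, 10, 200).astype(np.float32)
y_vals = swish(tf.constant(x_vals)).numpy()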

plt.figure(figsize=(6,4))
plt.plot(x_vals, y_vals)
plt.title("Swish Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.show()

Output:

Figure: Activation Function
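
As a quick sanity check, the hand-written function can be compared with TensorFlow's built-in tf.nn.silu, which computes the same formula, and its gradient can be checked against the analytic derivative sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x)). This is a sketch, and the input range of -5 to 5 is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf


def swish(x):
    return x * tf.nn.sigmoid(x)


x = tf.constant(np.linspace(-5.0, 5.0, 11), dtype=tf.float32)

# Record the forward pass so TensorFlow can differentiate it.
with tf.GradientTape() as tape:
    tape.watch(x)
    y = swish(x)
grad = tape.gradient(y, x)

# Analytic derivative of x * sigmoid(x).
s = tf.nn.sigmoid(x)
expected_grad = s + x * s * (1.0 - s)

matches_builtin = np.allclose(y.numpy(), tf.nn.silu(x).numpy(), atol=1e-6)
matches_derivative = np.allclose(grad.numpy(), expected_grad.numpy(), atol=1e-5)
print(matches_builtin, matches_derivative)
```

Because Swish is built from differentiable TensorFlow operations, no custom gradient has to be written for it.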

Step 4: Define Custom Layer

  • This code defines a custom dense neural network layer using TensorFlow and applies the Swish activation function.
  • The CustomDense class initializes weights and biases then computes the forward pass as swish(Wx + b).
  • A small batch from the training set is passed through this layer and the activations are visualized for each sample.
  • The resulting plot shows the activation of each of the four neurons for every sample, giving a quick view of how Swish shapes the layer's outputs.
Python
class CustomDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        # Forward standard Layer arguments such as name and dtype.
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='glorot_uniform',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return swish(tf.matmul(inputs, self.w) + self.b)

sample_input = tf.convert_to_tensor(X_train[:5], dtype=tf.float32)
layer = CustomDense(4)

# Calling the layer builds its weights automatically on first use.
output = layer(sample_input).numpy()

plt.figure(figsize=(6,4))
for i in range(output.shape[0]):
    plt.plot(output[i], label=f'Sample {i+1}')
plt.title("Output of Custom Layer (Swish Activation)")
plt.xlabel("Neuron Index")
plt.ylabel("Activation")
plt.legend()
plt.grid(True)
plt.show()

Output:

Figure: Custom Layer
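
Before training, it can help to confirm that the layer created the weights you expect and that gradients reach them. The sketch below repeats the layer definition so it runs on its own. It uses a random batch with 30 features as a stand-in for the real data, and that feature count is an assumption.

```python
import tensorflow as tf


def swish(x):
    return x * tf.nn.sigmoid(x)


class CustomDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(name="w",
                                 shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform",
                                 trainable=True)
        self.b = self.add_weight(name="b",
                                 shape=(self.units,),
                                 initializer="zeros",
                                 trainable=True)

    def call(self, inputs):
        return swish(tf.matmul(inputs, self.w) + self.b)


layer = CustomDense(4)
x = tf.random.normal((5, 30))  # stand-in batch: 5 rows, 30 features (assumed)

# Use a dummy loss so there is something to differentiate.
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(layer(x)))
grads = tape.gradient(loss, layer.trainable_weights)

weight_shapes = [tuple(w.shape) for w in layer.trainable_weights]
print(weight_shapes)
```

If any entry in grads is None, the corresponding weight is not connected to the output and will never be updated during training.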

Step 5: Build and Compile the Model

  • This code builds a neural network model using tf.keras.Sequential with a custom architecture.
  • It starts with an input layer matching the feature size, followed by a CustomDense layer with 32 units, a dropout layer with rate 0.3 to reduce overfitting and a second CustomDense layer with 16 units. It ends with a sigmoid-activated output layer for binary classification.
  • The model is compiled using the Adam optimizer, binary crossentropy loss and accuracy as the evaluation metric.
Python
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1],)),
    CustomDense(32),
    tf.keras.layers.Dropout(0.3),
    CustomDense(16),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
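
One way to check that the custom layers were wired in as intended is to compare the model's parameter count with a hand calculation: each dense-style layer has n_in * n_out weights plus n_out biases. The sketch below assumes 30 input features, which may differ from your data, and dense_params is a helper written only for this example.

```python
import tensorflow as tf


def swish(x):
    return x * tf.nn.sigmoid(x)


class CustomDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(name="w",
                                 shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform",
                                 trainable=True)
        self.b = self.add_weight(name="b",
                                 shape=(self.units,),
                                 initializer="zeros",
                                 trainable=True)

    def call(self, inputs):
        return swish(tf.matmul(inputs, self.w) + self.b)


n_features = 30  # assumed number of input features

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    CustomDense(32),
    tf.keras.layers.Dropout(0.3),
    CustomDense(16),
    tf.keras.layers.Dense(1, activation='sigmoid')
])


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


expected_params = (dense_params(n_features, 32)
                   + dense_params(32, 16)
                   + dense_params(16, 1))
actual_params = model.count_params()
print(expected_params, actual_params)
```

Dropout adds no parameters, so only the three dense-style layers contribute to the total.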

Step 6: Train the Model

  • This line trains the compiled neural network model on the training data for 20 epochs using mini-batches of size 32.
  • It also reserves 10% of the training data for validation during training to monitor performance and detect overfitting.
Python
model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.1)
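
The validation split can also be used to stop training automatically once validation loss stops improving. The following sketch uses the tf.keras.callbacks.EarlyStopping callback on a small synthetic problem. The data, the stand-in model and the patience of 3 are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data: the label depends only on the first feature.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(256, 8)).astype("float32")
y_demo = (X_demo[:, 0] > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop when val_loss has not improved for 3 epochs and keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

history = model.fit(
    X_demo, y_demo,
    epochs=20, batch_size=32,
    validation_split=0.1,
    callbacks=[early_stop],
    verbose=0,
)
epochs_run = len(history.history["val_loss"])
print(f"Trained for {epochs_run} epochs")
```

The same callback can be passed to the fit call in this step without changing anything else in the model.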

Step 7: Evaluate the Model

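  • This code evaluates the trained model on the test set and prints the final accuracy.
  • model.evaluate() returns the loss and accuracy on unseen data, which indicates how well the model generalizes beyond the training data. If the classes are imbalanced, as is common for fraud data, accuracy can be high even when few positive cases are detected, so precision and recall are worth checking as well.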
Python
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {accuracy:.4f}")

Output:

Test Accuracy: 0.9993
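
To see why a high accuracy needs care on imbalanced data, consider the sketch below. It uses made-up labels in which 2 of 1000 samples are positive and a trivial predictor that always outputs the negative class. These numbers are illustrative and are not taken from the dataset above.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Made-up labels: 2 positives out of 1000 samples.
y_true = np.zeros(1000, dtype=int)
y_true[:2] = 1

# A trivial "model" that always predicts the negative class.
y_pred = np.zeros(1000, dtype=int)

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, zero_division=0)
rec = recall_score(y_true, y_pred, zero_division=0)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

For the trained model, the predicted labels could be obtained with something like (model.predict(X_test) > 0.5).astype(int).ravel() and passed to the same functions. The 0.5 threshold is a common default rather than a requirement.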

You can download the Source code from here - Implementing Custom Layers and Activation Functions in TensorFlow

Applications

  1. Advanced Research and Experimentation: Allows researchers to prototype novel architectures and compare their effects on training. Useful for Kaggle competitions where slight performance gains matter.
  2. Domain-Specific Modeling: Tailoring models for financial time series, medical imaging or natural language tasks with domain-aware transformations. Custom layers can encode physics-specific or biology-specific constraints.
  3. AutoML: Supports testing unconventional or dynamic architectures that standard layers do not cover, such as attention-based activations or position-aware filtering.
  4. Model Optimization: Custom layers can be designed to reduce memory or compute cost. Inexpensive custom activation functions like hard sigmoid can help when deploying to mobile devices.
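
As an illustration of the last point, a hard sigmoid replaces the exponential in the sigmoid with a piecewise-linear function. The sketch below uses the relu6(x + 3) / 6 form, which is one common definition. Other definitions exist, so treat this choice as an assumption.

```python
import numpy as np
import tensorflow as tf


def hard_sigmoid(x):
    # Piecewise-linear approximation of the sigmoid: 0 below -3, 1 above 3.
    return tf.nn.relu6(x + 3.0) / 6.0


x = tf.constant([-6.0, -3.0, 0.0, 3.0, 6.0])
approx = hard_sigmoid(x).numpy()
exact = tf.nn.sigmoid(x).numpy()

# Largest difference from the true sigmoid on these sample points.
max_gap = float(np.max(np.abs(approx - exact)))
print(approx, max_gap)

# The function can be used like any other custom activation.
layer = tf.keras.layers.Dense(4, activation=hard_sigmoid)
```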
