Implementing Recurrent Neural Networks in PyTorch
Last Updated: 21 May, 2025
Recurrent Neural Networks (RNNs) are neural networks that are particularly effective for sequential data. Unlike traditional feedforward neural networks, RNNs have connections that form loops, allowing them to maintain a hidden state that captures information from previous inputs. This makes them suitable for tasks such as time series prediction, natural language processing and many more. In this article we will explore how to implement RNNs using PyTorch.
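This looping behaviour is exposed directly by PyTorch's nn.RNN module, which we use later in this article. As a minimal sketch with illustrative sizes (assuming PyTorch is already installed, which the next step covers), feeding a toy sequence through nn.RNN returns both the per-step outputs and the final hidden state:
Python
import torch
import torch.nn as nn

# A toy batch: 1 sequence of 4 time steps, 3 features per step.
rnn = nn.RNN(input_size=3, hidden_size=5, batch_first=True)
x = torch.randn(1, 4, 3)   # (batch, seq_len, input_size)
out, h_n = rnn(x)          # out: output at every step, h_n: final hidden state
print(out.shape)           # torch.Size([1, 4, 5])
print(h_n.shape)           # torch.Size([1, 1, 5])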
Before we start implementing the RNN, we need to set up our environment. Ensure you have PyTorch installed. You can install it using pip:
!pip install torch
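To verify the installation, you can print the installed version and check whether a GPU is visible (everything in this article also runs on the CPU):
Python
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA GPU is visible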
Classifying Movie Reviews Using RNN
In this example, we will use a public dataset to perform sentiment analysis on movie reviews. The goal is to classify each review as positive or negative using an RNN.
You can download the IMDB reviews dataset (IMDB Dataset.csv) and place it where your script can read it; the code below loads it from /content/, as in Google Colab.
1. Importing Libraries
We are importing:
- PyTorch (torch, torch.nn, torch.optim) for building and training neural networks.
- Pandas and NumPy for data handling and numerical operations.
- Matplotlib for visualization.
- Scikit-learn’s train_test_split and LabelEncoder for data splitting and label encoding.
Python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader
2. Loading and Preprocessing the Dataset
- Load the dataset using pd.read_csv() and assign column names.
- Lowercase and tokenize the text using pandas string methods.
- Encode labels into numeric form with LabelEncoder().
- Split the data into training and testing sets using train_test_split().
- Create a vocabulary set from all unique words in the dataset.
- Map each unique word to a unique index.
- Define an encode_and_pad() function that converts tokenized sentences into sequences of indices and pads them to the maximum sequence length.
- Process training and testing texts with encode_and_pad() to prepare data for modeling.
Python
# Load the dataset and name the columns.
df = pd.read_csv("/content/IMDB Dataset.csv", names=["text", "label"])

# Lowercase and whitespace-tokenize each review.
df['text'] = df['text'].str.lower().str.split()

# Encode the string labels as 0/1.
le = LabelEncoder()
df['label'] = le.fit_transform(df['label'])

# Split into training and test sets; .copy() avoids pandas'
# SettingWithCopyWarning when we overwrite the 'text' column below.
train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)
train_data, test_data = train_data.copy(), test_data.copy()

# Build the vocabulary and map each word to an index (0 is reserved for padding).
vocab = {word for phrase in df['text'] for word in phrase}
word_to_idx = {word: idx for idx, word in enumerate(vocab, start=1)}
max_length = df['text'].str.len().max()

def encode_and_pad(text):
    # Convert tokens to indices and right-pad with 0 up to max_length.
    encoded = [word_to_idx[word] for word in text]
    return encoded + [0] * (max_length - len(encoded))

train_data['text'] = train_data['text'].apply(encode_and_pad)
test_data['text'] = test_data['text'].apply(encode_and_pad)
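As a quick check of encode_and_pad(), here is a hypothetical two-token input; the exact indices depend on the vocabulary built above (a Python set, so they vary between runs):
Python
sample = ['great', 'movie']        # assumes both words occur in the dataset
encoded = encode_and_pad(sample)
print(encoded[:5], len(encoded))   # e.g. [1042, 87, 0, 0, 0] max_length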
3. Creating Dataset and Data Loader
- Define a custom SentimentDataset class inheriting from PyTorch’s Dataset.
- Store texts and labels from input data within the class.
- Implement __len__ method to return total number of samples.
- Implement __getitem__ method to retrieve a single sample by index, converting text and label to PyTorch tensors with correct data types.
- Create dataset instances for training and testing data.
- Wrap datasets in DataLoaders with a batch size of 32.
- Shuffle training data in DataLoader for randomness, keep test data ordered.
- Prepare data for efficient batch loading during model training and evaluation.
Python
class SentimentDataset(Dataset):
    def __init__(self, data):
        self.texts = data['text'].values
        self.labels = data['label'].values

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Return one (text, label) pair as PyTorch tensors.
        text = self.texts[idx]
        label = self.labels[idx]
        return torch.tensor(text, dtype=torch.long), torch.tensor(label, dtype=torch.long)

train_dataset = SentimentDataset(train_data)
test_dataset = SentimentDataset(test_data)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
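As a sanity check (not part of the original pipeline), pulling one batch from the loader confirms the shapes the model will receive:
Python
texts, labels = next(iter(train_loader))
print(texts.shape)    # torch.Size([32, max_length])
print(labels.shape)   # torch.Size([32])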
4. Defining the RNN Model
- Define a SentimentRNN class inheriting from PyTorch’s nn.Module.
- Initialize an embedding layer to convert word indices into dense vectors.
- Add an RNN layer to process the input sequences.
- Include a fully connected layer to map RNN outputs to the final output size.
- In the forward method, pass input sequences through the embedding layer.
- Create an initial hidden state of zeros and process the sequence using the RNN layer.
- Take the output from the last time step and pass it through the fully connected layer to produce predictions.
- Set the vocabulary size, embedding size, hidden size and output size.
- Instantiate the SentimentRNN model with these parameters.
Python
class SentimentRNN(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, output_size):
        super(SentimentRNN, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.embedding(x)   # (batch, seq_len, embed_size)
        # Initial hidden state of zeros; storing hidden_size on the module
        # avoids relying on a global variable here.
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])   # logits from the last time step
        return out

vocab_size = len(vocab) + 1   # +1 for the padding index 0
embed_size = 128
hidden_size = 128
output_size = 2

model = SentimentRNN(vocab_size, embed_size, hidden_size, output_size)
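Before training, a forward pass on a dummy batch is a cheap way to confirm the model is wired up correctly; this sketch reuses the max_length variable from the preprocessing step:
Python
dummy = torch.zeros(4, max_length, dtype=torch.long)   # a fake batch of 4 padded reviews
print(model(dummy).shape)                              # torch.Size([4, 2]) -- one logit per class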
5. Training the Model
- Define the loss function as cross-entropy loss.
- Set up the Adam optimizer with a learning rate of 0.001.
- Specify the number of training epochs.
- For each epoch set the model to training mode.
- Initialize epoch loss to zero.
- For each batch of texts and labels from the training loader: compute model outputs, calculate the loss and zero the optimizer gradients.
- Perform backpropagation to compute gradients, update the model weights with the optimizer and accumulate the batch loss into the epoch loss.
Python
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10

for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()   # clear gradients from the previous step
        loss.backward()         # backpropagate
        optimizer.step()        # update the weights
        epoch_loss += loss.item()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')
Output:
[Per-epoch training loss printed by the loop above]
6. Evaluating the Model
- Set the model to evaluation mode.
- Initialize counters for correct predictions and total samples.
- Use torch.no_grad() to disable gradient calculations.
- Iterate over test loader batches and compute model outputs.
- Determine predicted classes by selecting the max output score and update total samples count.
- Increment correct count for matching predictions and true labels.
Python
model.eval()
correct = 0
total = 0

with torch.no_grad():   # no gradients needed for evaluation
    for texts, labels in test_loader:
        outputs = model(texts)
        _, predicted = torch.max(outputs.data, 1)   # class with the highest logit
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f'Accuracy: {accuracy:.2f}%')
Output:
Accuracy: 86.64%
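To run the trained model on a new review (not covered in the steps above), you can reuse word_to_idx, max_length and the label encoder; the .get(word, 0) fallback, which maps unseen words to the padding index, is a small assumption beyond the encode_and_pad() helper defined earlier:
Python
def predict_sentiment(review):
    # Tokenize the same way as training; unseen words map to padding index 0.
    tokens = review.lower().split()
    encoded = [word_to_idx.get(word, 0) for word in tokens]
    encoded = encoded + [0] * (max_length - len(encoded))
    x = torch.tensor([encoded], dtype=torch.long)
    with torch.no_grad():
        logits = model(x)
    return le.inverse_transform([logits.argmax(dim=1).item()])[0]

print(predict_sentiment("a wonderful film with brilliant acting"))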
7. Visualizing Training Loss
We re-run the training loop (continuing from the current weights) while recording the average loss of each epoch, then plot the curve with Matplotlib.
Python
# Re-run the training loop, recording the average loss of each epoch.
losses = []
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    losses.append(epoch_loss / len(train_loader))
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

plt.figure(figsize=(10, 6))
plt.plot(range(1, num_epochs + 1), losses, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.show()
Output:
[Line plot of the training loss per epoch]
Based on the training loss plot, the RNN model demonstrates good performance. Although the loss fluctuates throughout training, it shows a steady decrease by the final epochs, indicating that the model is improving over time. The consistent reduction in loss towards the end shows that the model is learning effectively and converging towards a more optimized state.
You can also build an RNN model using TensorFlow; for that, refer to this article: Training of Recurrent Neural Networks (RNN) in TensorFlow.