Implementing Recurrent Neural Networks in PyTorch
Last Updated: 27 Feb, 2025
Recurrent Neural Networks (RNNs) are a class of neural networks that are particularly effective for sequential data. Unlike traditional feedforward neural networks, RNNs have connections that form loops, allowing them to maintain a hidden state that captures information from previous inputs. This makes them suitable for tasks such as time series prediction, natural language processing and many others. In this article we will explore how to implement RNNs using PyTorch.
Building an RNN from Scratch in PyTorch
Setting Up the Environment
Before we start implementing the RNN, we need to set up our environment. Ensure you have PyTorch installed. You can install it using pip:
pip install torch
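To confirm the installation worked, you can check the installed version and whether a GPU is visible (a quick sanity check, not required for this tutorial):
Python
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible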
Predicting Sequential Data
To understand the fundamentals of how an RNN operates, we will first build a basic synthetic dataset and train the network to predict the next value in a series of numbers.
Step 1: Import Libraries
First, we need to import the necessary libraries.
Python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
Step 2: Create Synthetic Dataset
We will create a simple sine wave dataset. The goal is to predict the next value in the sine wave sequence.
Python
def generate_data(seq_length, num_samples):
    X = []
    y = []
    for i in range(num_samples):
        # One full period of a sine wave, shifted by the sample index
        x = np.linspace(i * 2 * np.pi, (i + 1) * 2 * np.pi, seq_length + 1)
        sine_wave = np.sin(x)
        X.append(sine_wave[:-1])  # inputs: the first seq_length points
        y.append(sine_wave[1:])   # targets: the same points shifted one step ahead
    return np.array(X), np.array(y)
seq_length = 50
num_samples = 1000
X, y = generate_data(seq_length, num_samples)
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32)
print(X.shape, y.shape)
Output:
torch.Size([1000, 50]) torch.Size([1000, 50])
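Note that nn.RNN with batch_first=True expects input of shape (batch, seq_len, input_size). Our tensors are (batch, seq_len), so during training we will add a feature dimension with unsqueeze(2); a quick look at what that does:
Python
# (1000, 50) -> (1000, 50, 1): each scalar becomes a 1-dimensional feature vector
print(X.unsqueeze(2).shape)  # torch.Size([1000, 50, 1])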
Step 3: Define the RNN Model
Next, we will define a simple RNN model with a single recurrent layer followed by a fully connected output layer.
Python
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleRNN, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out)
        return out
input_size = 1
hidden_size = 20
output_size = 1
model = SimpleRNN(input_size, hidden_size, output_size)
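Before training, it can help to push a dummy batch through the model to confirm the shapes line up (an optional sanity check):
Python
# A batch of 4 sequences, each of length seq_length, with 1 feature per step
dummy = torch.randn(4, seq_length, 1)
print(model(dummy).shape)  # expected: torch.Size([4, 50, 1])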
Step 4: Train the Model
Now we will train the model using Mean Squared Error (MSE) loss and the Adam optimizer.
Python
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    # Add a feature dimension: (batch, seq_length) -> (batch, seq_length, 1)
    outputs = model(X.unsqueeze(2))
    loss = criterion(outputs, y.unsqueeze(2))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
Output:
Epoch [10/100], Loss: 0.3548
Epoch [20/100], Loss: 0.2653
Epoch [30/100], Loss: 0.1757
Epoch [40/100], Loss: 0.0921
Epoch [50/100], Loss: 0.0592
Epoch [60/100], Loss: 0.0421
Epoch [70/100], Loss: 0.0306
Epoch [80/100], Loss: 0.0222
Epoch [90/100], Loss: 0.0151
Epoch [100/100], Loss: 0.0093
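If you want to reuse the trained network later, you can save its weights with the standard state_dict pattern (shown here as a sketch; the filename is arbitrary):
Python
# Save only the learned parameters, then restore them into a fresh model
torch.save(model.state_dict(), 'simple_rnn.pth')

restored = SimpleRNN(input_size, hidden_size, output_size)
restored.load_state_dict(torch.load('simple_rnn.pth'))
restored.eval()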
Step 5: Visualize the Results
Finally, we will visualize the predictions made by the model.
Python
model.eval()
with torch.no_grad():
    predictions = model(X.unsqueeze(2)).squeeze(2).numpy()
plt.figure(figsize=(10, 6))
plt.plot(y[0].numpy(), label='True')
plt.plot(predictions[0], label='Predicted')
plt.legend()
plt.show()
Output:
Predicting Sequential Data

The plot shows how well the model's predictions (orange curve) match the true values (blue curve). The closeness of the two curves suggests that the RNN model is performing well and capturing the sequential patterns in the data effectively.
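The model was trained to predict one step ahead with the true sequence as input. A common follow-up is autoregressive forecasting, where each prediction is fed back in as the next input. Here is a minimal sketch, assuming the model, X and seq_length defined above:
Python
model.eval()
window = X[0].clone()  # seed the forecast with the first training sequence
forecast = []
with torch.no_grad():
    for _ in range(100):
        out = model(window.view(1, -1, 1))  # (1, seq_length, 1)
        next_val = out[0, -1, 0]            # the one-step-ahead prediction
        forecast.append(next_val.item())
        window = torch.cat([window[1:], next_val.view(1)])  # slide the window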
Now that we have worked with a synthetic dataset, let's move on to a real-world dataset.
Classifying Text Messages Using RNN
In this example we will use a public dataset to classify short text messages with an RNN. The goal is to label each SMS message as either ham (legitimate) or spam.

Step-by-Step Implementation:
Step 1: Import Libraries
Python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader
Step 2: Load and Preprocess the Dataset
We will load the SMS Spam Collection dataset directly from a URL and preprocess it. Each row contains a label ('ham' or 'spam') followed by the raw message text.
Python
url = "https://raw.githubusercontent.com/justmarkham/DAT8/master/data/sms.tsv"
df = pd.read_csv(url, delimiter='\t', header=None, names=['label', 'text'])

# Simple whitespace tokenization after lowercasing
def preprocess_text(text):
    return text.lower().split()

df['text'] = df['text'].apply(preprocess_text)
df = df[['text', 'label']]

# Encode the labels ('ham'/'spam') as integers
le = LabelEncoder()
df['label'] = le.fit_transform(df['label'])

train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)
train_data = train_data.copy()  # avoid SettingWithCopyWarning on later assignments
test_data = test_data.copy()

# Build the vocabulary; index 0 is reserved for padding
vocab = set([word for phrase in df['text'] for word in phrase])
word_to_idx = {word: idx for idx, word in enumerate(vocab, 1)}

def encode_phrase(phrase):
    return [word_to_idx[word] for word in phrase]

train_data['text'] = train_data['text'].apply(encode_phrase)
test_data['text'] = test_data['text'].apply(encode_phrase)

# Pad every sequence to the length of the longest message
max_length = max(df['text'].apply(len))

def pad_sequence(seq, max_length):
    return seq + [0] * (max_length - len(seq))

train_data['text'] = train_data['text'].apply(lambda x: pad_sequence(x, max_length))
test_data['text'] = test_data['text'].apply(lambda x: pad_sequence(x, max_length))
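To see what the preprocessing produces, you can encode and pad a single toy phrase by hand (a small illustration; the printed ids depend on the vocabulary built above, and .get is used so unseen words fall back to the padding id):
Python
sample = preprocess_text("Free entry to win a prize")
encoded = [word_to_idx.get(word, 0) for word in sample]  # 0 if a word is unseen
print(pad_sequence(encoded, max_length)[:10])  # first few ids, the rest is 0-padding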
Step 3: Create Dataset and Data Loader
Python
class SentimentDataset(Dataset):
    def __init__(self, data):
        self.texts = data['text'].values
        self.labels = data['label'].values

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        text = self.texts[idx]
        label = self.labels[idx]
        return torch.tensor(text, dtype=torch.long), torch.tensor(label, dtype=torch.long)

train_dataset = SentimentDataset(train_data)
test_dataset = SentimentDataset(test_data)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
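Pulling one batch from the loader confirms that each batch is a pair of padded token-id sequences and integer labels:
Python
texts, labels = next(iter(train_loader))
print(texts.shape, labels.shape)  # e.g. torch.Size([32, max_length]) and torch.Size([32])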
Step 4: Define the RNN Model
The SentimentRNN model reads a sequence of token ids, maps each token to a dense vector with an embedding layer, and processes the sequence with a recurrent layer. The hidden state at the last time step summarizes the whole message and is passed through a fully connected layer to produce a score for each of the two classes.
Python
class SentimentRNN(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, output_size):
        super(SentimentRNN, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.embedding(x)  # (batch, seq_len) -> (batch, seq_len, embed_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])  # classify from the last time step's hidden state
        return out
vocab_size = len(vocab) + 1  # +1 accounts for the padding index 0
embed_size = 128
hidden_size = 128
output_size = 2
model = SentimentRNN(vocab_size, embed_size, hidden_size, output_size)
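As before, a dummy forward pass is a cheap way to verify the model wiring (optional):
Python
# 4 fake messages of padded length max_length; ids drawn from the valid range
dummy = torch.randint(1, vocab_size, (4, max_length))
print(model(dummy).shape)  # expected: torch.Size([4, 2])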
Step 5: Train the Model
Python
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')
Output:
Epoch [1/10], Loss: 0.4016
Epoch [2/10], Loss: 0.3999
Epoch [3/10], Loss: 0.4004
Epoch [4/10], Loss: 0.3954
Epoch [5/10], Loss: 0.3969
Epoch [6/10], Loss: 0.3978
Epoch [7/10], Loss: 0.3960
Epoch [8/10], Loss: 0.3959
Epoch [9/10], Loss: 0.3967
Epoch [10/10], Loss: 0.3953
Step 6: Evaluate the Model
Python
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for texts, labels in test_loader:
        outputs = model(texts)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f'Accuracy: {accuracy:.2f}%')
Output:
Accuracy: 86.64%
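To classify a brand-new message with the trained model, it has to go through the same tokenize-encode-pad pipeline. A minimal sketch, assuming the objects defined in the steps above (the example message is made up):
Python
def predict_message(message):
    tokens = preprocess_text(message)
    ids = [word_to_idx.get(word, 0) for word in tokens]  # unseen words -> padding id
    padded = pad_sequence(ids, max_length)
    with torch.no_grad():
        logits = model(torch.tensor([padded], dtype=torch.long))
    return le.inverse_transform([logits.argmax(dim=1).item()])[0]

print(predict_message("Congratulations! You have won a free prize, call now"))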
Step 7: Visualize Training Loss
Python
# Note: this continues training for another `num_epochs`, recording the
# average loss per epoch so it can be plotted.
losses = []
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
    losses.append(epoch_loss / len(train_loader))
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

plt.figure(figsize=(10, 6))
plt.plot(range(1, num_epochs + 1), losses, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.show()
Output:
Epoch [1/10], Loss: 0.3946
Epoch [2/10], Loss: 0.3990
Epoch [3/10], Loss: 0.3968
Epoch [4/10], Loss: 0.3988
Epoch [5/10], Loss: 0.3949
Epoch [6/10], Loss: 0.3983
Epoch [7/10], Loss: 0.3997
Epoch [8/10], Loss: 0.3991
Epoch [9/10], Loss: 0.3991
Epoch [10/10], Loss: 0.3956
Training Loss

Based on the training loss plot, the loss fluctuates in a narrow band around 0.40 rather than decreasing steadily, which indicates the model converged within the first few epochs and has largely plateaued. Combined with the test accuracy above, this suggests the simple RNN has learned the task about as well as it can; a gated architecture such as an LSTM or GRU is the usual next step when a plain RNN stalls like this.
You can also build an RNN model using TensorFlow; for that, refer to this article: Training of Recurrent Neural Networks (RNN) in TensorFlow.