Experiment 5
Aim:
To implement a Recurrent Neural Network (RNN) for a given problem.
Objective:
To implement a basic RNN model (GRU-based) for a sequential dataset.
To improve on the initial model through hyperparameter tuning.
Theory:
Recurrent Neural Networks (RNNs) are a class of neural networks designed for processing
sequential data. Unlike traditional feedforward networks, RNNs maintain a hidden state that
captures information from previous inputs, enabling them to effectively model temporal
dependencies. This characteristic makes RNNs particularly useful for tasks such as language
modeling, speech recognition, and time series prediction. However, standard RNNs can
struggle with long-range dependencies due to issues like vanishing and exploding gradients,
which can hinder their ability to learn effectively over extended sequences.
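The hidden-state recurrence described above can be illustrated with a minimal NumPy sketch. The weight names (W_xh, W_hh, b_h), the layer sizes, and the tanh activation are illustrative assumptions and are not taken from the lab notebook.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state.

    inputs: array of shape (timesteps, input_dim)
    """
    h = np.zeros(W_hh.shape[0])       # initial hidden state
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(10, 1))     # 10 timesteps, 1 feature (hypothetical sizes)
W_xh = rng.normal(scale=0.5, size=(1, 4))
W_hh = rng.normal(scale=0.5, size=(4, 4))
b_h = np.zeros(4)

states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # (10, 4)
```

Because the same W_hh is applied at every step, gradients flowing back through many steps are multiplied by it repeatedly, which is where vanishing and exploding gradients come from.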
To address some of the limitations of basic RNNs, more advanced architectures have been
developed, including Long Short-Term Memory (LSTM) networks and Gated Recurrent Units
(GRUs). Both LSTMs and GRUs incorporate gating mechanisms that help control the flow of
information, allowing them to retain relevant information for longer periods while
discarding irrelevant data. This makes them significantly better at capturing long-range
dependencies compared to traditional RNNs.
GRUs, in particular, offer a simpler architecture than LSTMs while often achieving
comparable performance. A GRU merges the forget and input gates of an LSTM into a
single update gate, adds a reset gate that controls how much of the previous state feeds
the candidate state, and keeps no separate cell state. With fewer parameters per unit, GRUs
can train faster and cost less to compute, making them an attractive choice for many
applications in natural language processing and other sequential tasks.
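To make the gating concrete, the following is a minimal NumPy sketch of one GRU step, with an update gate and a second, reset gate. The parameter names in the dictionary and the layer sizes are hypothetical, and the final blend uses the convention in which the update gate z weights the old state; some references swap the roles of z and 1 - z.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU step; p is a dict of weights with hypothetical names."""
    z = sigmoid(x_t @ p["W_z"] + h_prev @ p["U_z"] + p["b_z"])   # update gate
    r = sigmoid(x_t @ p["W_r"] + h_prev @ p["U_r"] + p["b_r"])   # reset gate
    # Candidate state: the reset gate scales how much history is used.
    h_cand = np.tanh(x_t @ p["W_h"] + (r * h_prev) @ p["U_h"] + p["b_h"])
    # z decides how much of the old state is kept versus replaced.
    return z * h_prev + (1.0 - z) * h_cand

rng = np.random.default_rng(0)
input_dim, hidden_dim = 1, 4          # illustrative sizes
params = {}
for gate in ("z", "r", "h"):
    params[f"W_{gate}"] = rng.normal(scale=0.5, size=(input_dim, hidden_dim))
    params[f"U_{gate}"] = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
    params[f"b_{gate}"] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(10, input_dim)):   # 10 timesteps
    h = gru_step(x_t, h, params)
print(h.shape)  # (4,)
```

The step uses three weight groups (z, r, h) where an LSTM would use four, which is the source of the lower parameter count.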
Overall, RNNs, with their extensions like GRUs, represent a powerful tool for modeling
sequential data. Their ability to learn patterns over time allows them to tackle a wide range
of problems in artificial intelligence, from generating text to predicting future events. As
research continues, improvements in RNN architectures and training techniques will likely
enhance their effectiveness and applicability in various domains.
Problem Statement:
Consider the data of an airline company, consisting of a sequence of months from 1949 to
1960, each with their respective number of VIP passengers that the airline was able to
serve. Train a basic recurrent neural network model that tries to predict the number of VIP
passengers the airline may expect to serve in the coming months. Further, build a more
complex RNN model to see if the predictions improve.
Implementation:
The following implementation of the task was carried out in the lab session (link to
notebook: RecurrentNeuralNetwork):
Importing the required libraries and the dataset
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url, usecols=[1], engine='python')
df.columns = ['Passengers']
df
Normalizing the sequential data, creating input windows (each consisting of the 10 data
points i to i+9, used to predict the value at i+10), and splitting them 80/20 into training
and test sets
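The listing that follows calls create_sequences, but its definition does not appear in this write-up. A minimal sketch of such a helper is given here, assuming each window holds seq_length consecutive points and the target is the single point right after the window. The body is an assumption, not the notebook's original code.

```python
import numpy as np

def create_sequences(data, seq_length):
    """Slice a (n_samples, n_features) array into overlapping windows.

    Returns x of shape (n_windows, seq_length, n_features) and
    y of shape (n_windows, n_features), where y[i] directly follows x[i].
    """
    x, y = [], []
    for i in range(len(data) - seq_length):
        x.append(data[i:i + seq_length])   # points i .. i+seq_length-1
        y.append(data[i + seq_length])     # the next point
    return np.array(x), np.array(y)

# Toy check on a short ramp instead of the airline data.
toy = np.arange(15, dtype=float).reshape(-1, 1)
x_toy, y_toy = create_sequences(toy, seq_length=10)
print(x_toy.shape, y_toy.shape)  # (5, 10, 1) (5, 1)
print(y_toy.ravel())             # [10. 11. 12. 13. 14.]
```

Under this definition, the 144 monthly values of the dataset would yield 134 windows, of which the first 107 would go to training with the 0.8 split.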
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(df)
seq_length = 10
x_data, y_data = create_sequences(scaled_data, seq_length)
split = int(len(x_data) * 0.8)
x_train, y_train = x_data[:split], y_data[:split]
x_test, y_test = x_data[split:], y_data[split:]
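The write-up moves from the train/test split straight to plotting history, so the definition and training of the basic model are not shown. The sketch below is one plausible shape for that step, assuming a single GRU layer followed by a one-unit output. The function name build_basic_gru, the 32 units, and the random stand-in arrays are all hypothetical.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

def build_basic_gru(seq_length, units=32):
    """Single GRU layer followed by a one-unit regression head."""
    model = Sequential([
        Input(shape=(seq_length, 1)),   # (timesteps, features)
        GRU(units),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Stand-in arrays with the same layout as x_train / y_train in the report.
x_demo = np.random.rand(20, 10, 1).astype('float32')
y_demo = np.random.rand(20, 1).astype('float32')

model = build_basic_gru(seq_length=10)
history = model.fit(x_demo, y_demo, epochs=2, batch_size=8,
                    validation_split=0.2, verbose=0)
print(sorted(history.history.keys()))  # ['loss', 'val_loss']
```

In the notebook, the fit call would presumably use x_train and y_train with validation_data=(x_test, y_test), as the larger model later in the report does.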
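Plotting the training and validation loss of the trained model
import matplotlib.pyplot as plt
def plot_training_results(history):
    """Plot per-epoch training and validation loss from a Keras History object."""
    plt.figure(figsize=(12, 6))
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Training and Validation Loss')
    plt.legend()
    plt.show()
plot_training_results(history)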
Evaluating the model and plotting its predictions against the actual test values
test_loss = model.evaluate(x_test, y_test)
print(f"Test Loss: {test_loss}")
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
y_test_rescaled = scaler.inverse_transform(y_test)
plt.figure(figsize=(10, 6))
plt.plot(y_test_rescaled, label='Actual Data')
plt.plot(predictions, label='Predicted Data')
plt.title('Actual vs Predicted Airline Passengers')
plt.legend()
plt.show()
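The test loss printed above is a mean squared error on values scaled to the 0 to 1 range, which is hard to read in terms of passengers. One way to complement it, sketched here as an assumption rather than as part of the notebook, is a root mean squared error computed on the rescaled arrays. The helper name rmse and the sample numbers are made up for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two arrays of equal size."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up passenger counts shaped like y_test_rescaled and predictions.
actual = np.array([[400.0], [420.0], [460.0]])
predicted = np.array([[390.0], [430.0], [450.0]])
print(rmse(actual, predicted))  # 10.0
```

Applied as rmse(y_test_rescaled, predictions), the result would be in the same units as the Passengers column.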
Building a more complex model in an attempt to improve predictions
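from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dropout, Dense
# Three stacked GRU layers; all but the last return the full sequence
# so that the next GRU layer receives one vector per timestep.
model = Sequential([
    GRU(units=128, return_sequences=True, input_shape=(x_train.shape[1], 1)),
    Dropout(0.2),
    GRU(units=64, return_sequences=True),
    Dropout(0.2),
    GRU(units=32),
    Dropout(0.2),
    Dense(units=32, activation='relu'),
    Dense(units=1)  # single regression output: the next scaled value
])
model.compile(optimizer='adam', loss='mean_squared_error')
history = model.fit(x_train, y_train, epochs=300, batch_size=32,
                    validation_data=(x_test, y_test))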
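The training call above passes the test set as validation_data, so any hyperparameter chosen by watching val_loss has effectively seen the test data. One possible remedy, sketched here as an assumption and not as part of the lab notebook, is to carve a separate validation slice out of the time-ordered windows and keep the final slice untouched for testing. The function name chronological_split, the 70/15/15 fractions, and the stand-in array of 134 windows are hypothetical.

```python
import numpy as np

def chronological_split(x, y, train_frac=0.7, val_frac=0.15):
    """Split windows in time order into train, validation and test parts."""
    n = len(x)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return ((x[:train_end], y[:train_end]),
            (x[train_end:val_end], y[train_end:val_end]),
            (x[val_end:], y[val_end:]))

# Stand-in arrays; only the shapes matter for this illustration.
x_all = np.zeros((134, 10, 1))
y_all = np.zeros((134, 1))
train, val, test = chronological_split(x_all, y_all)
print(len(train[0]), len(val[0]), len(test[0]))  # 93 20 21
```

No shuffling is applied, so every validation and test window lies later in time than the training windows. The validation pair would then replace (x_test, y_test) in validation_data while tuning.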