
Experiment - 5

Name: Ansari Mohammed Shanouf Valijan


Class: B.E. Computer Engineering, Semester - VII
UID: 2021300004
Batch: M

Aim:
To implement a Recurrent Neural Network (RNN) for a given problem.

Objective:
• To implement a basic RNN model (GRU-based) for a sequential dataset.
• To attempt to improve the initial model by performing hyperparameter tuning.

Theory:
Recurrent Neural Networks (RNNs) are a class of neural networks designed for processing
sequential data. Unlike traditional feedforward networks, RNNs maintain a hidden state that
captures information from previous inputs, enabling them to effectively model temporal
dependencies. This characteristic makes RNNs particularly useful for tasks such as language
modeling, speech recognition, and time series prediction. However, standard RNNs can
struggle with long-range dependencies due to issues like vanishing and exploding gradients,
which can hinder their ability to learn effectively over extended sequences.
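
As a concrete illustration of this recurrence, a single step of a vanilla RNN can be sketched in NumPy as below. The dimensions and weights are purely illustrative and not taken from this experiment; the point is that the same hidden-to-hidden matrix is reapplied at every time step, which is what lets gradients shrink or blow up over long sequences.

import numpy as np

# Minimal sketch of one vanilla-RNN step: h_t = tanh(W_x x_t + W_h h_{t-1} + b).
# All dimensions below are illustrative, not part of the experiment.
input_dim, hidden_dim = 1, 8
W_x = np.random.randn(hidden_dim, input_dim) * 0.1   # input-to-hidden weights
W_h = np.random.randn(hidden_dim, hidden_dim) * 0.1  # hidden-to-hidden weights
b = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # The same W_h multiplies the state at every step; its repeated application
    # over long sequences is the source of vanishing/exploding gradients.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_dim)
for x_t in np.random.randn(10, input_dim):  # a toy sequence of length 10
    h = rnn_step(x_t, h)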

To address some of the limitations of basic RNNs, more advanced architectures have been
developed, including Long Short-Term Memory (LSTM) networks and Gated Recurrent Units
(GRUs). Both LSTMs and GRUs incorporate gating mechanisms that help control the flow of
information, allowing them to retain relevant information for longer periods while
discarding irrelevant data. This makes them significantly better at capturing long-range
dependencies compared to traditional RNNs.

GRUs, in particular, offer a simpler architecture than LSTMs while retaining comparable
performance. A GRU combines the forget and input gates of an LSTM into a single update
gate, simplifying the structure without sacrificing the ability to learn from sequential
data. This efficiency can lead to faster training and lower computational cost, making
GRUs an attractive choice for many applications in natural language processing and other
sequential tasks.
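
For reference, the gating just described can be written out directly. The following is a minimal NumPy sketch of one GRU step with made-up dimensions (it is not the Keras layer used in the implementation below); z is the update gate and r is the reset gate.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Illustrative GRU step with hypothetical dimensions (not the Keras GRU used later).
input_dim, hidden_dim = 1, 8
Wz, Uz = np.random.randn(hidden_dim, input_dim), np.random.randn(hidden_dim, hidden_dim)
Wr, Ur = np.random.randn(hidden_dim, input_dim), np.random.randn(hidden_dim, hidden_dim)
Wh, Uh = np.random.randn(hidden_dim, input_dim), np.random.randn(hidden_dim, hidden_dim)

def gru_step(x_t, h_prev):
    z = sigmoid(Wz @ x_t + Uz @ h_prev)               # update gate: how much to refresh the state
    r = sigmoid(Wr @ x_t + Ur @ h_prev)               # reset gate: how much old state to expose
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate state
    return (1 - z) * h_prev + z * h_tilde             # blend previous state and candidate

h = np.zeros(hidden_dim)
for x_t in np.random.randn(5, input_dim):             # a toy sequence of length 5
    h = gru_step(x_t, h)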

Overall, RNNs, with their extensions like GRUs, represent a powerful tool for modeling
sequential data. Their ability to learn patterns over time allows them to tackle a wide range
of problems in artificial intelligence, from generating text to predicting future events. As
research continues, improvements in RNN architectures and training techniques will likely
enhance their effectiveness and applicability in various domains.

Problem Statement:
Consider the data of an airline company, consisting of a sequence of months from 1949 to
1960, each with the number of VIP passengers that the airline served in that month. Train a
basic recurrent neural network model that predicts the number of VIP passengers the airline
may expect to serve in the coming months. Further, build a more complex RNN model to see if
the predictions improve.

Implementation:
The following implementation of the required task was carried out during the lab session
(Link to Notebook - RecurrentNeuralNetwork) -
Importing the required libraries and the dataset
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url, usecols=[1], engine='python')

df.columns = ['Passengers']

df

Normalizing the sequential data and creating sequence windows (each consisting of data points
i to i+9, used to predict the value at i+10)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(df)

def create_sequences(data, seq_length):
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        x = data[i:i+seq_length]
        y = data[i+seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

seq_length = 10
x_data, y_data = create_sequences(scaled_data, seq_length)
split = int(len(x_data) * 0.8)
x_train, y_train = x_data[:split], y_data[:split]
x_test, y_test = x_data[split:], y_data[split:]

print(f"Training set shape: {x_train.shape}, Test set shape: {x_test.shape}")

Building a simple GRU-based RNN model and compiling it
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
model = Sequential([
    GRU(units=64, input_shape=(x_train.shape[1], 1)),
    Dense(units=1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()

Training the model
history = model.fit(x_train, y_train, epochs=20, batch_size=16,
                    validation_data=(x_test, y_test))
Viewing the training and validation losses of the model over the epochs
import matplotlib.pyplot as plt

def plot_training_results(history):
    plt.figure(figsize=(12, 6))
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Training and Validation Loss')
    plt.legend()
    plt.show()

plot_training_results(history)

Evaluating the model and printing its predictions for the coming months
test_loss = model.evaluate(x_test, y_test)
print(f"Test Loss: {test_loss}")

predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
y_test_rescaled = scaler.inverse_transform(y_test)
plt.figure(figsize=(10, 6))
plt.plot(y_test_rescaled, label='Actual Data')
plt.plot(predictions, label='Predicted Data')
plt.title('Actual vs Predicted Airline Passengers')
plt.legend()
plt.show()
Building a more complex model in an attempt to improve predictions
from tensorflow.keras.layers import Dropout

model = Sequential([
    GRU(units=128, return_sequences=True, input_shape=(x_train.shape[1], 1)),
    Dropout(0.2),
    GRU(units=64, return_sequences=True),
    Dropout(0.2),
    GRU(units=32),
    Dropout(0.2),
    Dense(units=32, activation='relu'),
    Dense(units=1)
])

model.compile(optimizer='adam', loss='mean_squared_error')
history = model.fit(x_train, y_train, epochs=300, batch_size=32,
                    validation_data=(x_test, y_test))

Viewing the training history (over 300 epochs)
plot_training_results(history)
Viewing the updated predictions
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
y_test_rescaled = scaler.inverse_transform(y_test)
plt.figure(figsize=(10, 6))
plt.plot(y_test_rescaled, label='Actual Data')
plt.plot(predictions, label='Predicted Data')
plt.title('Actual vs Predicted Airline Passengers')
plt.legend()
plt.show()
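
To actually forecast the coming months beyond the available data, as the problem statement asks, the trained model could be rolled forward by feeding its own predictions back in as inputs. The following is a hedged sketch of that idea; the names future_steps and last_window are introduced here for illustration and do not appear in the original notebook.

# Illustrative multi-step forecast: feed each prediction back in as the next input.
# 'future_steps' and 'last_window' are names introduced for this sketch only.
future_steps = 12
last_window = scaled_data[-seq_length:].reshape(1, seq_length, 1)

future_scaled = []
for _ in range(future_steps):
    next_val = model.predict(last_window, verbose=0)   # shape (1, 1)
    future_scaled.append(next_val[0, 0])
    # slide the window forward by one step, appending the new prediction
    last_window = np.append(last_window[:, 1:, :],
                            next_val.reshape(1, 1, 1), axis=1)

future = scaler.inverse_transform(np.array(future_scaled).reshape(-1, 1))
print(future.ravel())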
Inferences:
• For the first model that was constructed, one observes a sharp decrease in training and
validation losses over the epochs, going from roughly 0.06 and 0.09 respectively to 0.01
and 0.02.
• The losses then stabilize after the 7th epoch, at values of about 0.005 and 0.0165
respectively.
• The predictions obtained from this model, when compared to the true trend, show a broad
similarity. However, they appear too generalized (overly smoothed) to be of practical use.
• For the comparatively complex model built next, an improvement was observed in the
training and validation losses. Over 300 epochs, a sharp initial reduction in losses was
observed, followed by the training loss stabilizing at about 0.002 while the validation
loss continued to fluctuate slightly.
• The predictions made by this model better represent the true trend, demonstrating that
increasing the model complexity improved the predictions.
