
Text Generation using Gated Recurrent Unit Networks - ML

Last Updated : 12 Jul, 2025

Gated Recurrent Units (GRUs) are a type of Recurrent Neural Network (RNN) designed to handle sequential data such as text by using gating mechanisms to regulate the flow of information. Unlike traditional RNNs, which suffer from the vanishing gradient problem, GRUs offer a more efficient way to capture long-range dependencies in sequences. In this article, we will build a text generator using a GRU network that produces creative text based on the patterns it learns.
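To make the gating mechanism concrete, here is a minimal NumPy sketch of a single GRU step following the formulation of Cho et al. (2014); the parameter names (W_z, U_z, b_z and so on) and the tiny random example are purely illustrative and are separate from the Keras model built later. Note that individual implementations may blend the old and candidate states with a slightly different convention.

Python
import numpy as np

def gru_step(x_t, h_prev, p):
    # One GRU time step; p holds weight matrices and biases (illustrative names).
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

    z = sigmoid(p['W_z'] @ x_t + p['U_z'] @ h_prev + p['b_z'])   # update gate: how much to refresh the state
    r = sigmoid(p['W_r'] @ x_t + p['U_r'] @ h_prev + p['b_r'])   # reset gate: how much past state to use
    h_tilde = np.tanh(p['W_h'] @ x_t + p['U_h'] @ (r * h_prev) + p['b_h'])  # candidate state
    return (1 - z) * h_prev + z * h_tilde                        # blend old state and candidate

# Tiny usage example with random weights
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
p = {k: rng.standard_normal((d_h, d_in)) for k in ['W_z', 'W_r', 'W_h']}
p.update({k: rng.standard_normal((d_h, d_h)) for k in ['U_z', 'U_r', 'U_h']})
p.update({k: np.zeros(d_h) for k in ['b_z', 'b_r', 'b_h']})
h = gru_step(rng.standard_normal(d_in), np.zeros(d_h), p)
print(h)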

1. Importing the required libraries

We will import the following libraries:

  • pandas: For easy data loading and handling.
  • numpy: For numerical operations.
  • tensorflow: To build and train the GRU model.
  • random: For generating random starting points in text.
Python
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense, Activation
from tensorflow.keras.optimizers import RMSprop
import random

2. Loading the data into a string

Here we are using a dataset of poems to train our GRU model. You can download the dataset from here. We load the text lines into a pandas DataFrame, join all lines into a single string and preview the first 500 characters.

Python
df = pd.read_csv('/content/poems.csv', header=None, names=['text'])
text = " ".join(df['text'].tolist())

print(text[:500])

Output:

Through the forest deep, where shadows linger long, the night sings its song......whisper of hope in the silence

3. Creating Character Mappings

We will extract the unique characters in the text and create mappings from characters to indices and back.

  • set(text): Converts the text into a set to find the unique characters.
  • vocab: A sorted list of all unique characters in the text.
  • char_to_idx: Maps each character to a unique index.
  • idx_to_char: The reverse mapping, from each index back to its corresponding character.
Python
vocab = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(vocab)}
idx_to_char = {i: c for i, c in enumerate(vocab)}
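As an optional sanity check (not part of the original article), we can print the vocabulary size and a few of the mappings:

Python
print(len(vocab))                     # number of unique characters
print(list(char_to_idx.items())[:5])  # first few character-to-index pairs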

4. Prepare Input and Output Sequences

We will split the text into overlapping sequences of length 100, and for each sequence the next character is the label. We then one-hot encode both the inputs and the outputs.

  • max_len: The length of each input sequence (100 characters).
  • step: The step size by which the sliding window moves (5 characters).
  • sentences: List of subsequences of length max_len.
  • next_chars: List of the character that follows each subsequence.
  • X and y: Arrays that hold the one-hot encoded input and output data.
  • X[i, t, char_to_idx[char]] = 1: One-hot encodes each character in the input sequence.
  • y[i, char_to_idx[next_chars[i]]] = 1: One-hot encodes the next character for the output.
Python
max_len = 100
step = 5

sentences = []
next_chars = []

for i in range(0, len(text) - max_len, step):
    sentences.append(text[i: i + max_len])
    next_chars.append(text[i + max_len])

X = np.zeros((len(sentences), max_len, len(vocab)), dtype=bool)
y = np.zeros((len(sentences), len(vocab)), dtype=bool)

for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_to_idx[char]] = 1
    y[i, char_to_idx[next_chars[i]]] = 1
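A quick shape check (optional) confirms the encoding: X should have one row per sequence, max_len time steps and one column per character, while y holds one one-hot vector per sequence.

Python
print(X.shape)  # (num_sequences, max_len, vocab_size)
print(y.shape)  # (num_sequences, vocab_size)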

5. Building the GRU network

We will create a network with a single GRU layer of 128 units:

  • GRU(128): Adds a GRU layer with 128 units which processes the input sequences and retains memory of previous inputs.
  • Dense(len(vocab)): The output layer with one unit per character in the vocabulary.
  • Activation('softmax'): The softmax activation ensures the output is a probability distribution over all characters.
  • RMSprop(learning_rate=0.01): Specifies the RMSprop optimizer with a learning rate of 0.01.
  • model.compile(): Compiles the model with categorical cross-entropy loss (used for multi-class classification) and the RMSprop optimizer.
  • model.summary(): Displays a summary of the model architecture including layer types, output shapes and number of parameters.
Python
model = Sequential()
model.add(GRU(128, input_shape=(max_len, len(vocab))))
model.add(Dense(len(vocab)))
model.add(Activation('softmax'))

optimizer = RMSprop(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

model.summary()

Output:

[Model summary output: GRU, Dense and Activation layers with their output shapes and parameter counts]

6. Training the GRU model

The model.fit() function trains the model on the input data (X) and target labels (y) for 30 epochs with a batch size of 128.

Python
model.fit(X, y, batch_size=128, epochs=30)

Output:

[Training progress output for 30 epochs]
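Optionally (a sketch, not part of the original workflow), a Keras ModelCheckpoint callback can save the best weights seen during training so a long run can be reused later; the file name gru_text_gen.keras is arbitrary.

Python
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint('gru_text_gen.keras', monitor='loss', save_best_only=True)
model.fit(X, y, batch_size=128, epochs=30, callbacks=[checkpoint])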

7. Defining Text Generation Function

We define two functions, sample and generate_text, where:

  • sample(preds, temperature): Adjusts model output probabilities with temperature to control randomness, then samples the next character index from the distribution.
  • generate_text(length, temperature): Starts with a random seed sequence, repeatedly predicts the next character using the model and sample(), updates the input sequence and builds generated text of the specified length.
Python
def sample(preds, temperature=1.0):
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds + 1e-8) / temperature  # add epsilon to avoid log(0)
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

def generate_text(length=400, temperature=0.5):
    start_idx = random.randint(0, len(text) - max_len - 1)
    generated = ''
    sentence = text[start_idx: start_idx + max_len]
    generated += sentence

    for _ in range(length):
        x_pred = np.zeros((1, max_len, len(vocab)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_to_idx[char]] = 1
        
        preds = model.predict(x_pred, verbose=0)[0]
        next_idx = sample(preds, temperature)
        next_char = idx_to_char[next_idx]

        generated += next_char
        sentence = sentence[1:] + next_char

    return generated
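To see how the temperature parameter affects the randomness of the output, we can compare generations at a few values (an illustrative usage example, not from the original article): lower temperatures give safer, more repetitive text, while higher temperatures give more varied but noisier text.

Python
for temp in [0.2, 0.5, 1.0]:
    print(f"--- temperature {temp} ---")
    print(generate_text(length=200, temperature=temp))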

8. Generate Sample Text

We can now generate sample text using our trained model.

Python
print(generate_text(length=500, temperature=0.3))

, dreams take flight on wings of stars. Love ...........proud and wise. Th

Here we can see that the model is working well and can now be used to generate text with a GRU.

You can download the source code from here.
