Matrix

This document contains code that generates image captions with a neural network model and evaluates them with a confusion matrix. For each image it loads the file, prints the real captions, generates a predicted caption, and collects both the actual and the predicted captions. The collected captions are then tokenized, padded to a maximum length, converted to numerical labels, and compared in a confusion matrix, which is plotted to show how the predictions line up with the actual captions.

Uploaded by

Banana banna
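import os

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from sklearn.metrics import confusion_matrix
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# BASE_DIR, mapping, features, model, tokenizer, max_length and
# predict_caption() are used below but not defined in this script;
# they are expected to exist already (for example from earlier notebook cells).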

def generate_caption(image_name):
    # Get the image id from the image name (the part before the first dot)
    image_id = image_name.split('.')[0]

    # Load the image
    img_path = os.path.join(BASE_DIR, "Images", image_name)
    image = Image.open(img_path)

    # Display real captions
    captions = mapping[image_id]
    print('Real Captions')
    for caption in captions:
        print(caption)

    # Generate predicted caption
    y_pred = predict_caption(model, features[image_id], tokenizer, max_length)
    print('Estimated caption')
    print(y_pred)

    # Append the actual and predicted captions to their respective lists.
    # The single prediction is repeated so it pairs with every real caption.
    actual_captions.extend(captions)
    predicted_captions.extend([y_pred] * len(captions))

    # Display the image
    plt.imshow(image)
    plt.axis('off')
    plt.show()

# Prepare the lists for actual captions and predicted captions
actual_captions = []
predicted_captions = []

# Generate captions and collect actual and predicted captions
generate_caption("599366440_a238e805cf.jpg")
generate_caption("47871819_db55ac4699.jpg")
# Add more images as needed

# Create a separate tokenizer for evaluation. Reusing the name `tokenizer`
# would overwrite the one generate_caption() passes to predict_caption().
eval_tokenizer = Tokenizer()
eval_tokenizer.fit_on_texts(actual_captions + predicted_captions)

# Convert actual and predicted captions to sequences of numerical labels
actual_sequences = eval_tokenizer.texts_to_sequences(actual_captions)
predicted_sequences = eval_tokenizer.texts_to_sequences(predicted_captions)

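# Convert sequences to padded sequences. Padding and truncating at the end
# keeps the captions aligned from the first word.
actual_sequences = pad_sequences(actual_sequences, maxlen=max_length,
                                 padding='post', truncating='post')
predicted_sequences = pad_sequences(predicted_sequences, maxlen=max_length,
                                    padding='post', truncating='post')

# Convert sequences to labels: one token id per position.
# (np.argmax over a row of token ids returns a position, not a label,
# so the sequences are flattened and compared token by token instead.)
actual_labels = actual_sequences.flatten()
predicted_labels = predicted_sequences.flatten()

# Drop positions where both captions are padding (id 0)
keep = (actual_labels != 0) | (predicted_labels != 0)
actual_labels = actual_labels[keep]
predicted_labels = predicted_labels[keep]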

# Get all unique labels
all_labels = np.unique(np.concatenate((actual_labels, predicted_labels)))

# Generate the confusion matrix
cm = confusion_matrix(actual_labels, predicted_labels, labels=all_labels)

# Visualize the confusion matrix
class_names = all_labels
fig, ax = plt.subplots()
im = ax.imshow(cm, interpolation='nearest', cmap='Blues')
ax.figure.colorbar(im, ax=ax)
ax.set(xticks=np.arange(cm.shape[1]),
       yticks=np.arange(cm.shape[0]),
       xticklabels=class_names, yticklabels=class_names,
       xlabel='Predicted Labels', ylabel='Actual Labels',
       title='Confusion Matrix')
plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
         rotation_mode="anchor")
plt.show()
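
To make the tokenize, pad, and compare steps easier to follow, here is a minimal standard-library sketch of a token-level comparison on two toy caption pairs. It is an illustration only and not the script's own method: the helper names (build_word_index, to_padded_ids, token_confusion) and the toy captions are hypothetical, the word ids are assigned in order of first appearance rather than by frequency as the Keras Tokenizer does, and no punctuation filtering is applied.

```python
from collections import Counter


def build_word_index(texts):
    """Map each distinct lowercase word to an id starting at 1 (0 is padding)."""
    index = {}
    for text in texts:
        for word in text.lower().split():
            if word not in index:
                index[word] = len(index) + 1
    return index


def to_padded_ids(text, word_index, max_len):
    """Convert a caption to word ids, truncated or right-padded with 0."""
    ids = [word_index[w] for w in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


def token_confusion(actual_texts, predicted_texts, max_len):
    """Count (actual_id, predicted_id) pairs position by position.

    Positions where both captions are padding are skipped.
    """
    word_index = build_word_index(actual_texts + predicted_texts)
    counts = Counter()
    for actual, predicted in zip(actual_texts, predicted_texts):
        a_ids = to_padded_ids(actual, word_index, max_len)
        p_ids = to_padded_ids(predicted, word_index, max_len)
        for a, p in zip(a_ids, p_ids):
            if a != 0 or p != 0:
                counts[(a, p)] += 1
    return word_index, counts


# Hypothetical toy data: two real captions paired with the same prediction
actual = ["a dog runs on grass", "a dog plays outside"]
predicted = ["a dog runs on sand", "a dog runs on sand"]

word_index, counts = token_confusion(actual, predicted, max_len=6)
id_to_word = {i: w for w, i in word_index.items()}
id_to_word[0] = "<pad>"

# Each row is one cell of the confusion matrix: actual word, predicted word, count
for (a, p), n in sorted(counts.items()):
    print(id_to_word[a], id_to_word[p], n)
```

Pairs on the diagonal (the same word in the same position) are matches; every other pair is a word the prediction got wrong at that position. Because the comparison is positional, a prediction that is correct but shifted by one word is counted as all mismatches, so the matrix is best read as a rough diagnostic.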