
DL Report (Prabal)

To leverage machine learning and image recognition techniques to analyze a dataset of movies by recognizing and categorizing movie posters using a pre-trained model. This project aims to provide insights into the effectiveness of image recognition models in accurately identifying and classifying movie posters, and to evaluate the performance of these models through detailed accuracy analysis and visualization.

Uploaded by

prabaltiwar2004

PROJECT REPORT

On
Image & Video Recognition using
movies_dataset.csv
Submitted in partial fulfilment
Of
“TAE-1(Project Based Learning)”
Under the subject
Deep Learning
(VII Semester, B.Tech CSE)
(ACADEMIC SESSION 2024-25)

Submitted By:
Name: Prabal Arvind Tiwari
USN: CS21042

Evaluation:
Selection of related Algorithm (3)
Topic/Algorithm Knowledge (3)
Implementation (2)
Analysis (2)
Total

S. B. JAIN INSTITUTE OF TECHNOLOGY, MANAGEMENT & RESEARCH, NAGPUR
(AN AUTONOMOUS INSTITUTION AFFILIATED TO RASHTRASANT TUKADOJI MAHARAJ
NAGPUR UNIVERSITY, NAAC ACCREDITED WITH 'A' GRADE)

Aim: To leverage machine learning and image recognition techniques to analyze a
dataset of movies by recognizing and categorizing movie posters using a pre-trained
model. This project aims to provide insights into the effectiveness of image recognition
models in accurately identifying and classifying movie posters, and to evaluate the
performance of these models through detailed accuracy analysis and visualization.

Objectives:
• Load the Movie Dataset: Import the dataset containing movie information
along with poster URLs.
• Download Movie Posters: Download and save movie posters from the
provided URLs.
• Perform Image Recognition: Use the VGG16 pre-trained model to recognize
and label the movie posters.
• Analyze Recognition Accuracy: Compute and visualize the accuracy of the
image recognition results using accuracy metrics, classification report, and
confusion matrix.

Algorithm:

1. Load the Dataset:


• Read the movie dataset from an Excel file.
• Display the first few rows of the dataset to ensure it is loaded correctly.

2. Download Movie Posters:


• Define a function to download images from URLs and save them locally.
• Iterate through the dataset, download each poster, and save the file path in the
dataset.

3. Perform Image Recognition:


• Load the VGG16 pre-trained model.
• Define a function to preprocess the images, perform prediction, and decode the
results.
• Apply the function to each poster image and save the recognized labels in the
dataset.

4. Analyze Recognition Accuracy:

• Add dummy true labels for the purpose of demonstration (replace with actual
labels if available).
• Compute accuracy, generate a classification report, and create a confusion
matrix.
• Visualize the confusion matrix using a heatmap.
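
Step 4 can be illustrated with a minimal, library-free sketch. The labels below are hypothetical values invented for this example, since the dataset has no true poster labels; the helper names (accuracy, confusion_matrix, per_class_report) are likewise assumptions made for illustration and are not part of the project code.

```python
from collections import Counter

# Hypothetical labels for illustration only: the dataset has no ground-truth
# poster labels, so both lists below are made up.
true_labels = ["book_jacket", "comic_book", "book_jacket", "web_site", "comic_book"]
pred_labels = ["book_jacket", "book_jacket", "book_jacket", "web_site", "comic_book"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred):
    """Return (classes, matrix); matrix[i][j] counts true class i predicted as class j."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix


def per_class_report(classes, matrix):
    """Precision and recall per class, the core of a classification report."""
    report = {}
    for i, cls in enumerate(classes):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        report[cls] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report


acc = accuracy(true_labels, pred_labels)
classes, matrix = confusion_matrix(true_labels, pred_labels)
report = per_class_report(classes, matrix)

print(f"Accuracy: {acc:.2f}")
print("Classes:", classes)
for cls, row in zip(classes, matrix):
    print(cls, row)
for cls, scores in report.items():
    print(cls, scores)
```

In practice, accuracy_score, classification_report, and confusion_matrix from sklearn.metrics compute the same quantities, and seaborn's heatmap can draw the resulting matrix. With dummy true labels the numbers only demonstrate the procedure and say nothing about the model itself.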

Implementation:

!pip install opencv-python tensorflow keras


import pandas as pd

# Load the dataset
dataset_path = '/content/Movie_dataset.xlsx'  # Update with the actual path to your Excel file
movies_df = pd.read_excel(dataset_path)

# Display the first few rows of the dataset
print(movies_df.head())

# Check for missing values
print(movies_df.isnull().sum())

# Drop rows with missing values (if any)
movies_df.dropna(inplace=True)

# Display the dataset after dropping missing values
print(movies_df.head())

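import requests
from PIL import Image, UnidentifiedImageError
from io import BytesIO
import os


def download_image(url, save_path):
    """Download the image at `url` and save it as a JPEG at `save_path`."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Treat HTTP errors (404, 500, ...) as failures
        img = Image.open(BytesIO(response.content))
        # JPEG cannot store alpha or palette modes, so convert to RGB first
        img.convert('RGB').save(save_path)
        return True
    except (requests.exceptions.RequestException, UnidentifiedImageError, OSError) as e:
        print(f"Error downloading image from {url}: {e}")
        return False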

# Create a directory to save the images
os.makedirs('posters', exist_ok=True)

# Create the column up front so it holds Python objects (paths or None)
movies_df['Poster_Path'] = None

# Download and save poster images
for i, row in movies_df.iterrows():
    poster_url = row['Poster_Link']
    save_path = f'posters/{i}.jpg'
    success = download_image(poster_url, save_path)
    if success:
        movies_df.at[i, 'Poster_Path'] = save_path
    else:
        movies_df.at[i, 'Poster_Path'] = None

# Display the dataframe to check which images were successfully downloaded
print(movies_df.head())

from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing.image import img_to_array, load_img
import numpy as np

# Load the VGG16 model
model = VGG16(weights='imagenet')

# Function to perform image recognition
def recognize_image(img_path):
    # Check that the path is not None and the file exists
    if img_path and os.path.exists(img_path):
        try:
            img = load_img(img_path, target_size=(224, 224))
            x = img_to_array(img)
            x = np.expand_dims(x, axis=0)
            x = preprocess_input(x)
            preds = model.predict(x)
            return decode_predictions(preds, top=1)[0][0][1]
        except Exception as e:
            print(f"Error processing image {img_path}: {e}")
            return None
    else:
        print(f"Invalid image path: {img_path}")
        return None

# Apply the image recognition function to each poster
movies_df['Recognized_Label'] = movies_df['Poster_Path'].apply(recognize_image)

# Display the updated dataframe to check the results
print(movies_df.head())

# Drop the Poster_Path column to avoid saving image paths in CSV
movies_df.drop(columns=['Poster_Path'], inplace=True)

# Save the results to a new CSV file
movies_df.to_csv('recognized_movies.csv', index=False)

# Download the file
from google.colab import files
files.download('recognized_movies.csv')

import matplotlib.pyplot as plt
import seaborn as sns

# Plot the distribution of recognized labels
plt.figure(figsize=(9, 6))
sns.countplot(data=movies_df, x='Recognized_Label')
plt.title('Distribution of Recognized Labels')
plt.xticks(rotation=90)
plt.show()

# Plot the distribution of genres
plt.figure(figsize=(12, 6))
sns.countplot(data=movies_df, x='Genre')
plt.title('Distribution of Genres')
plt.xticks(rotation=90)
plt.show()

# Plot IMDB Rating vs Recognized Labels
plt.figure(figsize=(8, 6))
sns.boxplot(data=movies_df, x='Recognized_Label', y='IMDB_Rating')
plt.title('IMDB Rating vs Recognized Labels')
plt.xticks(rotation=90)
plt.show()

print(movies_df.columns)

# Function to download images
# Note: this redefines download_image from above with a different signature
def download_image(url, output_folder, image_name):
    os.makedirs(output_folder, exist_ok=True)
    img_data = requests.get(url).content
    with open(os.path.join(output_folder, image_name), 'wb') as handler:
        handler.write(img_data)

# Example usage
output_folder = 'posters'
movies_df['image_path'] = movies_df['Series_Title'].apply(
    lambda x: os.path.join(output_folder, f"{x}.jpg"))
movies_df.apply(
    lambda row: download_image(row['Poster_Link'], output_folder, f"{row['Series_Title']}.jpg"),
    axis=1)

from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
import numpy as np

# Load pre-trained VGG16 model + higher level layers
model = VGG16(weights='imagenet')

# Function to perform image recognition
def recognize_image(image_path):
    img = image.load_img(image_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    preds = model.predict(x)
    return decode_predictions(preds, top=1)[0][0]

# Example usage
movies_df['recognized_objects'] = movies_df['image_path'].apply(
    lambda x: recognize_image(x) if os.path.exists(x) else ("", "", 0))

# Display the dataframe with recognized objects
print(movies_df.head())

# Save the results to a new CSV file
output_path = 'movies_dataset_with_recognition.csv'
movies_df.to_csv(output_path, index=False)
print(f"Results saved to {output_path}")

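
Because decode_predictions(preds, top=1)[0][0] returns a (class ID, label, score) tuple, the recognized_objects column holds tuples, and these are written to the CSV as plain strings. A minimal sketch of splitting such tuples into separate fields is given below. The sample entries and their scores are made up for illustration, and split_prediction is a hypothetical helper, not part of the project code.

```python
# Hypothetical sample of what the `recognized_objects` column might hold.
# Each entry mirrors the (class_id, label, score) tuple returned by
# decode_predictions; the values are invented for illustration only.
recognized = [
    ("n06596364", "comic_book", 0.62),
    ("n07248320", "book_jacket", 0.48),
    ("", "", 0),  # fallback used when the poster file is missing
]


def split_prediction(pred):
    """Return a dict with separate class ID, label, and score fields."""
    class_id, label, score = pred
    return {
        "class_id": class_id or None,  # empty string becomes None
        "label": label or None,
        "score": float(score),
    }


rows = [split_prediction(p) for p in recognized]
for row in rows:
    print(row)
```

Storing the three fields in separate columns before calling to_csv keeps the score numeric and makes missing posters easy to filter out.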
Screenshot:
Analysis:
Conclusion:
In conclusion, we performed image recognition on the posters of a movie dataset. We
loaded the dataset, downloaded the movie posters, and processed them with the
pre-trained VGG16 model to generate recognized labels. Because VGG16 is trained on
ImageNet, these labels are generic object categories rather than movie titles or
genres, and the dataset provides no true labels for the posters, so the accuracy
metrics, classification report, and confusion matrix rely on dummy labels and serve
only as a demonstration of the procedure. Despite these limitations and some
processing errors, the project showcased the potential of leveraging machine
learning for practical image recognition tasks. Future work can focus on refining the
dataset, exploring other models, and improving error handling for enhanced results.

Signature of Student Signature of Course In-Charge
