Automatic Image Captioning Bot With CNN and RNN: - Submitted By-Harkirat Singh CSE-3 01976802717
Datasets
• Common Objects in Context (COCO). A collection of more than 120 thousand images with descriptions.
• Flickr 8K. A collection of 8 thousand described images taken from flickr.com.
• Flickr 30K. A collection of 30 thousand described images taken from flickr.com.
• Exploring Image Captioning Datasets, 2016
Data Collection
There are many open-source datasets available for this problem, such as Flickr 8k (containing 8k images), Flickr 30k (containing 30k images), MS COCO (containing 180k images), etc.
For the purpose of this case study, I have used the Flickr 8k dataset, which you can download by filling out the request form provided by the University of Illinois at Urbana-Champaign. Training a model on a large number of images may also not be feasible on a system that is not a very high-end PC/laptop.
This dataset contains 8000 images, each with 5 captions (as we have already seen in the Introduction section, an image can have multiple captions, all being relevant simultaneously).
(Figure: image captioning example, caption "A white dog in a grassy area")
These images are split as follows (a small loading sketch follows the list):
•Training Set — 6000 images
•Dev Set — 1000 images
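As a rough sketch of how the split can be read in Python (assuming the plain-text split files that ship with the Flickr 8k download, e.g. "Flickr_8k.trainImages.txt"; the file names here are an assumption about the local copy):

# Sketch: read the train/dev split from the Flickr 8k split files.
# The file names below are assumptions about the downloaded dataset layout.
def load_split(filename):
    with open(filename) as f:
        # one image file name (e.g. "1000268201_693b08cb0e.jpg") per line
        return set(line.strip() for line in f if line.strip())

train_images = load_split("Flickr_8k.trainImages.txt")  # ~6000 image names
dev_images = load_split("Flickr_8k.devImages.txt")      # ~1000 image names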
Data Preprocessing — Images
Images are nothing but the input (X) to our model. As you may already know, any input to a model must be given in the form of a vector.
We need to convert every image into a fixed-size vector which can then be fed as input to the neural network. For this purpose, we opt for transfer learning using the InceptionV3 model (a convolutional neural network) created by Google Research.
This model was trained on the ImageNet dataset to perform image classification on 1000 different classes of images. However, our purpose here is not to classify the image but just to get a fixed-length, informative vector for each image. This process is called automatic feature engineering.
Hence, we just remove the last softmax layer from the model and extract a 2048-length vector (bottleneck features) for every image, as shown in the sketch below.
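A minimal sketch of this step, assuming Keras/TensorFlow is available (the image path below is only an example), is to load InceptionV3 and take the output of its penultimate (global average pooling) layer as the image encoding:

# Sketch: extract 2048-length bottleneck features with InceptionV3 (Keras/TensorFlow).
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model

# Load InceptionV3 pretrained on ImageNet and drop the final softmax layer,
# keeping the 2048-dimensional average-pooling output as the image vector.
base = InceptionV3(weights="imagenet")
encoder = Model(inputs=base.input, outputs=base.layers[-2].output)

def encode_image(img_path):
    # InceptionV3 expects 299x299 RGB input.
    img = image.load_img(img_path, target_size=(299, 299))
    x = image.img_to_array(img)
    x = preprocess_input(x)        # scale pixel values to [-1, 1]
    x = np.expand_dims(x, axis=0)  # add a batch dimension
    return encoder.predict(x).reshape(2048)

feature_vector = encode_image("example.jpg")  # shape: (2048,)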
Data Preparation
This is one of the most important steps in this case study. Here we will understand how to prepare the data in a manner that is convenient to give as input to the deep learning model.
Consider we have 2 images and their 2 corresponding captions as given:
(Train image 1) Caption -> The black cat sat on grass
(Train image 2) Caption -> The white cat is walking on road
First we need to convert both the images to their corresponding 2048-length feature vectors as discussed above. Let "Image_1" and "Image_2" be the feature vectors of the first two images respectively.
Secondly, let's build the vocabulary for the first two (train) captions by adding the two tokens "startseq" and "endseq" to both of them (assume we have already performed the basic cleaning steps):
Caption_1 -> "startseq the black cat sat on grass endseq"
Caption_2 -> "startseq the white cat is walking on road endseq"
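A small illustrative sketch of this step in Python, using only the two example captions above (the dictionary keys and variable names are purely for illustration):

# Sketch: wrap the (already cleaned) captions with start/end tokens and build the vocabulary.
train_captions = {
    "Image_1": ["the black cat sat on grass"],
    "Image_2": ["the white cat is walking on road"],
}

# Add "startseq" and "endseq" so the model knows where a caption begins and ends.
for img_id, caps in train_captions.items():
    train_captions[img_id] = ["startseq " + c + " endseq" for c in caps]

# Vocabulary = the set of all words appearing in the training captions.
vocab = set()
for caps in train_captions.values():
    for c in caps:
        vocab.update(c.split())

print(sorted(vocab))
# ['black', 'cat', 'endseq', 'grass', 'is', 'on', 'road', 'sat', 'startseq', 'the', 'walking', 'white']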
THANK YOU!