
Volume 6, Issue 6, June-2021 International Journal of Innovative Science and Research Technology

ISSN No: 2456-2156

Automatic Image Caption Generation System


Satyabrat Mandal
Student in Department of Information Technology,
Smt. Kashibai Navale College Of Engineering,
Savitribai Phule Pune University,
Ambegaon, Pune, Maharashtra, India.

Nachiket Lele
Student in Department of Information Technology,
Smt. Kashibai Navale College Of Engineering,
Savitribai Phule Pune University,
Ambegaon, Pune, Maharashtra, India.

Chinmay Kunawar
Student in Department of Information Technology,
Smt. Kashibai Navale College Of Engineering,
Savitribai Phule Pune University,
Ambegaon, Pune, Maharashtra, India.

Abstract:- Computer vision has become omnipresent in our society, with uses in several fields. In this project, we focus on one of the visually demanding recognition tasks in computer vision, namely image captioning. The problem of generating language descriptions for images is still considered an open one, and it has been studied more rigorously in the field of video. In the past few years more emphasis has been given to still images and their description in human-understandable natural language. The task of detecting scenes and objects has become easier thanks to the studies of the last few years. The main motive of our project is to train convolutional neural networks with various hyperparameters on large image datasets such as Flickr8k, using architectures such as ResNet, and to combine the resulting image classifiers with a recurrent neural network to obtain the desired caption for the image. In this paper we present the detailed architecture of the image captioning model.

Keywords:- Computer Vision, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Xception, Flickr8k, LSTM, Preprocessing.

I. INTRODUCTION

In the past few years the field of AI known as deep learning has developed greatly because of its impressive results, in terms of accuracy, compared with existing machine learning algorithms. Producing a meaningful sentence from an image is a difficult task, but done successfully it can have a huge impact, for example by helping the visually impaired gain a better understanding of images.

Image captioning is considered somewhat more difficult than image classification, which has been the main focus of the computer vision community. Finding the relationships between the objects in the image is the most important factor to consider. In addition to this visual understanding of the image, the semantic knowledge has to be expressed in a natural language such as English, which means that a language model is also required. The attempts made in the past have mostly stitched these two models together.

In the proposed model we combine them into a single model consisting of a Convolutional Neural Network (CNN) encoder that creates image encodings; we use the Xception architecture with some modifications. These encodings are then passed to an LSTM network layer, a kind of Recurrent Neural Network. The specification used for the LSTM network is similar to the ones used in machine translators. We then use the Flickr8k dataset to train the model. The model generates a caption as output, built from the dictionary formed by the tokens of the captions in the training set.

II. PROBLEM DEFINITION

Image caption generation has been considered a challenging and significant research area that constantly follows advances in statistical language modelling and image recognition. Caption generation can benefit many people, for example by helping the visually impaired through automatic captions for the millions of images uploaded to the internet every day, which will help them understand the World Wide Web.

III. PROBLEM SOLUTION

In our view the main components of image captioning are a CNN and an RNN; the two are then merged to obtain captions for the images, as described below.
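As a rough illustration of how the two components can be merged, the sketch below wires up a CNN-encoder plus LSTM-decoder "merge" model in Keras. It is a minimal sketch under stated assumptions: the 2048-dimensional image-feature size (Xception's pooled output), the 256-unit embedding and LSTM sizes, and the vocabulary and caption-length values are illustrative, not taken from this paper.

```python
# Minimal sketch of a CNN-encoder + LSTM-decoder "merge" captioning model in Keras.
# The layer sizes, vocab_size and max_length below are illustrative assumptions.
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

vocab_size = 8000   # number of distinct caption words (assumed)
max_length = 34     # longest tokenized caption, in words (assumed)

# Image branch: a pre-extracted Xception feature vector is projected to 256-d.
image_input = Input(shape=(2048,))
image_branch = Dense(256, activation="relu")(Dropout(0.5)(image_input))

# Text branch: the partial caption (integer word ids) is embedded and run through an LSTM.
caption_input = Input(shape=(max_length,))
caption_branch = Embedding(vocab_size, 256, mask_zero=True)(caption_input)
caption_branch = LSTM(256)(Dropout(0.5)(caption_branch))

# Merge the two branches and predict a probability for every word in the vocabulary.
merged = Dense(256, activation="relu")(add([image_branch, caption_branch]))
next_word = Dense(vocab_size, activation="softmax")(merged)

model = Model(inputs=[image_input, caption_input], outputs=next_word)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

At each step such a model takes the image encoding and the caption generated so far, and predicts the next word; repeating this until an end marker is produced yields the full caption.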

Convolutional Neural Network (CNN)

Fig. 1: CNN Architecture

Convolutional Neural Networks (CNNs) have been an important factor in the improvement of image classification. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has produced various open-source deep learning architectures such as ZFNet, AlexNet, VGG16, ResNet and Xception, which have a great ability to classify images; for encoding our images we use Xception in our model. The image used for classification needs to be resized to 224*224. The only preprocessing done is subtracting the mean RGB values, computed over the training images, from each pixel. The CNN layers use 3*3 filters and the stride length is fixed at 1; max pooling is done using a 2*2-pixel window with a stride length of 2. The output of the encoder is thus a 1*1*4096 encoding, which is then passed to the language-generating RNN. Other frameworks such as ResNet are also successful in this field, but they are computationally very expensive, since the number of layers in ResNet is very high compared to Xception, and they therefore require a very powerful system.
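As a rough sketch of this encoding step, the snippet below uses the pretrained Xception model shipped with Keras, with its classification head removed. Note the assumptions: Keras' Xception expects a 299*299 input and its own preprocess_input scaling (rather than the 224*224 resize and mean subtraction described above), its pooled output is 2048-dimensional, and the image path is hypothetical.

```python
# Minimal sketch: encode an image with a pretrained, headless Xception network.
import numpy as np
from tensorflow.keras.applications.xception import Xception, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array

encoder = Xception(include_top=False, pooling="avg")   # global-average-pooled features

def encode_image(path):
    image = load_img(path, target_size=(299, 299))     # resize to Xception's input size
    pixels = preprocess_input(img_to_array(image))     # scale pixels the way Xception expects
    return encoder.predict(np.expand_dims(pixels, 0), verbose=0)[0]

features = encode_image("example.jpg")                 # hypothetical image path
print(features.shape)                                  # -> (2048,)
```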

Recurrent Neural Network (RNN)

Recurrent neural networks are a type of artificial neural network in which the connections between units form a directed cycle. A recurrent neural network can also be described as a network with loops, in which information persists inside the network; it can be thought of as multiple copies of the same network, each passing a message to its successor. One of the problems with RNNs is that they do not take long-term dependencies into account. To overcome this problem of "long-term dependencies", Hochreiter and Schmidhuber put forward the Long Short-Term Memory (LSTM). The key idea behind the LSTM network is the horizontal line running along the top of the cell, known as the cell state. The cell state is carried through all the repeating modules, and each module modifies it with the help of gates. This lets the LSTM network retain all the available information.

Fig. 2: A simple recurrent neural network unrolled into a chain of copies of itself

Fig. 3: Four interacting layers in an LSTM cell

Datasets to be used
For the task of image captioning we use the Flickr8k dataset. The dataset contains 8000 images with 5 captions per image, and is split by default into image and text folders. Each image has a unique id, and the captions for each image are stored against the corresponding id. The dataset contains 6000 training images, 1000 development images and 1000 test images.

Fig. 4: Sample photo with captions from the Flickr8k dataset
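The caption file can be read into a dictionary keyed by image id. The sketch below assumes the common Flickr8k.token.txt layout, where each line holds "image_id#n<TAB>caption"; the file name and layout are assumptions about the particular distribution being used.

```python
# Minimal sketch: load the Flickr8k captions into a dict of image id -> list of captions.
from collections import defaultdict

def load_captions(path="Flickr8k.token.txt"):           # assumed file name and format
    captions = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            image_id, caption = line.rstrip("\n").split("\t")
            image_id = image_id.split("#")[0]            # strip the per-image caption index #0..#4
            captions[image_id].append(caption.lower())
    return captions

captions = load_captions()
# each of the 8000 image ids should now map to its five reference captions
```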
Tokenizing Captions
The Recurrent Neural Network (RNN) part of the system is trained on the captions given in the Flickr8k dataset. We train the RNN to forecast the next word of a sentence from the preceding words; to do this we have to convert the captions linked with the images into lists of tokenized words, turning any string into a list of integers.

First, we go through all the training captions and generate a dictionary that maps every distinct word to a numerical index, so each word we pass through has a corresponding integer value in this dictionary. The words in this dictionary are referred to as our vocabulary. Each word of a caption is then converted into the vector format in which it will be used. After this step, we train the RNN to predict the next word of a sentence.
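A minimal sketch of this tokenizing step, using Keras' Tokenizer to build the word-to-index dictionary (our vocabulary) and to turn caption strings into integer sequences; `captions` is assumed to be the dictionary from the previous sketch.

```python
# Minimal sketch: build the vocabulary and turn captions into integer sequences.
from tensorflow.keras.preprocessing.text import Tokenizer

all_captions = [c for caps in captions.values() for c in caps]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(all_captions)          # builds the word -> integer index dictionary
vocab_size = len(tokenizer.word_index) + 1    # +1 because index 0 is reserved for padding

# Convert one caption string into the list of integers the RNN is trained on.
sequence = tokenizer.texts_to_sequences([all_captions[0]])[0]
```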

IV. RESULTS AND OBSERVATION

We tested our system on around 300 images of different categories and observed that for about 178 images we got perfect captions; these images mostly contained very few objects, around one or two. However, when the subject of an image wears a multicoloured shirt the system cannot recognise the colours and picks the brightest colour, such as red, as the main colour, as in Fig. 5. It also cannot distinguish between moving and still objects, and it does not detect multiple instances of the same object, as in Fig. 6.

From our observations we found a precision of 63%, which was better than what had been obtained with the datasets used before it.

Fig. 5: Snapshot of the output

Fig. 6: Snapshot of the output

Fig. 7: Snapshot of the output
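For reference, captions such as those shown in Fig. 5-7 can be produced by decoding one word at a time until an end marker is emitted. The sketch below shows a simple greedy decoding loop; it assumes the `model`, `tokenizer`, `max_length` and `encode_image` objects from the earlier sketches and hypothetical "startseq"/"endseq" markers wrapped around every training caption, and the greedy strategy itself is an assumption rather than something stated in this paper.

```python
# Minimal sketch: greedy, word-by-word caption generation at test time.
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

index_to_word = {i: w for w, i in tokenizer.word_index.items()}

def generate_caption(image_path):
    features = encode_image(image_path).reshape(1, -1)   # (1, 2048) image encoding
    caption = "startseq"                                  # assumed start-of-caption marker
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([caption])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        probs = model.predict([features, seq], verbose=0)[0]
        word = index_to_word.get(int(np.argmax(probs)))
        if word is None or word == "endseq":              # assumed end-of-caption marker
            break
        caption += " " + word
    return caption.replace("startseq", "").strip()
```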
V. CONCLUSION

Image caption generation combines a Convolutional Neural Network and a Long Short-Term Memory network to detect objects and caption images. Image caption generation has many advantages, and we have discussed a convolutional approach to it. Even though automatically generating captions for images is a complex task, with the help of such models and powerful deep learning networks it is possible to obtain good results.

As future scope, we can extend our project to the next level by modifying our model to generate captions even for live video. Currently our model generates captions only for still images, which is itself a difficult task, and captioning live video is much more complex. The system is completely GPU-based, and captioning live video is not feasible on general CPUs. Video captioning is a popular research area that is going to change people's way of life, with use cases in almost every domain; it can automate major tasks such as video surveillance and other security work. We can also extend our work by enhancing the model to produce a voice clip for the generated caption, which will help visually impaired people get an idea of the image.

ACKNOWLEDGEMENT

We are very grateful to all the teachers of our college who have helped us with their valuable guidance towards the completion of our project entitled "Automatic Image Caption Generation System", which is part of our Bachelor of Engineering (B.E.) syllabus. We convey our genuine regards to our department, which has provided us with the essential guidance.

We want to convey our special thanks to Prof. M. V. Raut for providing us with all the necessary instructions and guidance, solving our problems, giving us insight at each and every step, and contributing her knowledge and experience to making this project come true. We are also very thankful to Prof. R. H. Borhade and Prof. L. V. Patil for their invaluable support. The acknowledgement would be incomplete without mentioning our Principal, Prof. Dr. A. V. Deshpande, whose constant assistance and encouragement have been highly important in making our project.

We would also like to express our gratitude towards our parents and friends for their kind cooperation and encouragement, which helped us finish this project.


