
2022 International Conference for Advancement in Technology (ICONAT)

Goa, India. Jan 21-22, 2022

Image Caption Generation using Deep Neural Networks

Sudhakar J, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India ([email protected])
Viswesh Iyer V, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India ([email protected])
Sree Sharmila T, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India ([email protected])

Abstract—In recent years, computer vision has made significant progress, primarily in the fields of image classification, object detection, and recognition. Describing image content automatically using natural language is challenging and has a tremendous potential impact. Here, the idea is to extract features from an image, generate a caption, and convert the generated caption to speech. This work systematically analyses deep neural network based image caption generation. With an image as input, the model outputs an English sentence that describes the content of the image using a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), and sentence generation. The generated caption is converted to audio using Google's Text-to-Speech (gTTS). The models are built on the Flickr 8k dataset, consisting of 8000+ images. Usually, human beings describe a scene using natural language, which is compact and concise; machine vision systems, however, describe the scene/image by taking in an image, which is a two-dimensional array.

Keywords—Image Captioning, Deep Neural Networks, CNN, RNN, Text-to-Speech.

I. INTRODUCTION

Humans are capable of processing a large amount of information in an instant. This information is most often pictures, videos, or anything in written format. Every image carries a large amount of information through which humans decipher and process it, and natural language is used to describe the image. Any individual can generate multiple captions for the same image. If the same can be achieved through machines, it paves the way for simplifying multiple coherent tasks. However, generating captions for images is a very tedious and demanding task for today's machines. Generating a caption using a machine requires a basic understanding of natural language processing, along with differentiating objects and correlating them. Earlier approaches were based on predefined syntax, which restricts the type of sentences created. Exploiting the advancements in image classification and object detection, it becomes feasible to automatically generate captions, ranging from one to several sentences, that capture the content of an image; this is image captioning [1]. At present, many well-designed deep networks are trained on very large databases. Many architectures have been introduced, such as GoogLeNet, which is a 22-layer deep CNN, ResNet, and several variants of VGG. The most commonly used datasets for image caption training are the Flickr datasets, as shown in Fig. 1, which include thousands of processed images.

In this paper, various existing image captioning models have been studied, along with how they generate captions for images. We have also documented the results of our implementation of the models we used (VGG16 and ResNet50) with a comparison.

Fig. 1. Flickr 8k image dataset

II. RELATED WORK

Human beings are competent because of their reasoning and intelligence, combining the relationships between images and objects. Creating an image captioning system that mimics human language is a very challenging task. A single image can be described by more than one sentence, each of which can be used as a caption, which relates to text summarization in NLP (Natural Language Processing).

There are many ways to generate a caption for an image. The most common methods are generative-based and retrieval-based methods. One of the best retrieval-based models was proposed and implemented by Girish Kulkarni, Vicente Ordonez, and Tamara L. Berg, and is called the Im2Txt model [4]. Their system consists of two parts: image matching and caption generation.

An input image is provided to the model, and matching images are retrieved from a database containing images and their appropriate captions. Once the images are found, high-level objects from the original input image are compared with those of the matching images. The main disadvantage of such a retrieval-based method is that it can only generate captions already available in the dataset; it cannot generate genuinely novel captions.

The limitations of the retrieval-based method [7] are solved by generative-based models, which are used to create novel captions for images. They are either pipeline-based or end-to-end models. A pipeline-based model uses two separate and distinct learning processes: it first identifies objects in an image and then provides the result to the modeling task. In an end-to-end model, both language modeling and image recognition are performed together, and both parts of the model learn simultaneously. Such models are usually created using a combination of CNN and RNN.

The Show and Tell model proposed by Vinyals et al. [3] is a generative end-to-end model. It is one of the forerunner models used as a reference in image captioning, as it draws on recent advancements in both captioning and recognizing images. It uses a combination of LSTM cells and the Inception version 3 (v3) model.

All the above works pave the way for enhancing models to develop image captioning systems. Using a CNN and an RNN is the most feasible and effective way to caption an image from a dataset.

Our contribution over the existing models is to train on the Flickr8k dataset and obtain trained weights through which image captioning can be done, and to convert the generated caption to speech, which is useful to visually impaired users and for image recognition in self-driving cars.

III. IMAGE CAPTION GENERATION SYSTEM

Humans have advanced levels of reasoning and are experienced at generating captions by incorporating objects and their relationships in an image. However, creating a captioning system that precisely mimics humans is a challenging task.

A. System Architecture

Fig. 2 shows the architecture for image captioning, which is based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) [6].

Fig. 2. System architecture of the image captioning model

A Convolutional Neural Network, usually called a CNN or ConvNet, is a class of deep neural networks commonly applied to analyze images. In the model used here, ResNet50 [8] serves as the CNN, since it prevents degradation and vanishing gradient problems [5] in the network during intensive training and helps maintain good accuracy. It is 50 layers deep, and it can be optimized for increased depth.
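As a concrete illustration of this encoder stage, the following is a minimal sketch (assuming Keras with TensorFlow; the paper does not give its exact code, and the image path is hypothetical) of extracting a fixed-length feature vector from an image with a pretrained ResNet50:

```python
# Minimal sketch: extract a 2048-d feature vector per image with ResNet50.
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.preprocessing import image

# pooling='avg' drops the classification head and global-average-pools the
# last convolutional feature map into a single 2048-d vector.
encoder = ResNet50(weights='imagenet', include_top=False, pooling='avg')

def extract_features(img_path):
    img = image.load_img(img_path, target_size=(224, 224))   # ResNet50 input size
    x = image.img_to_array(img)
    x = preprocess_input(np.expand_dims(x, axis=0))           # add batch dim, normalize
    return encoder.predict(x)[0]                              # shape: (2048,)

features = extract_features('example.jpg')  # hypothetical image path
```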
Recurrent Neural Networks (RNN) are a class of deep neural networks that are helpful for modeling sequence data; they use patterns to predict the next possible outcome. In the model used here, a Long Short-Term Memory (LSTM) network is used as the RNN, as shown in Fig. 3.

Fig. 3. Architecture diagram of the LSTM (RNN)
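To make the CNN-RNN wiring concrete, here is a minimal sketch of a caption decoder that merges the ResNet50 feature vector with an LSTM over the partial caption. The layer sizes, vocabulary size, and maximum caption length are illustrative assumptions, not the paper's reported configuration:

```python
# Sketch of a CNN-feature + LSTM caption decoder (merge architecture).
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

vocab_size, max_len = 8000, 34   # assumed values for illustration

img_in = Input(shape=(2048,))                  # ResNet50 feature vector
img_emb = Dense(256, activation='relu')(Dropout(0.5)(img_in))

seq_in = Input(shape=(max_len,))               # partial caption as word indices
seq_emb = Embedding(vocab_size, 256, mask_zero=True)(seq_in)
seq_out = LSTM(256)(Dropout(0.5)(seq_emb))

merged = Dense(256, activation='relu')(add([img_emb, seq_out]))
word_probs = Dense(vocab_size, activation='softmax')(merged)   # next-word distribution

model = Model(inputs=[img_in, seq_in], outputs=word_probs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
```

During training, each caption is expanded into (image, partial word sequence, next word) pairs, which is the standard way such a merge decoder is fitted.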
B. Datasets

Training datasets are a crucial factor in predicting any outcome of a system. Many image datasets are available for caption generation; the most common are the Flickr, Pascal, and MSCOCO datasets [2]. In this work, the Flickr8k dataset is used. It contains a collection of images of different everyday activities together with their related captions. First, every object in an image is labeled, followed by a description based on the objects mapped to the image. The Flickr8k dataset contains 8091 images gathered from six different Flickr groups.

C. Implementation and Training Procedure

The features of an image are extracted by training on the images from the dataset using convolutional neural networks. Images are taken from the Flickr8k dataset and fed into the ResNet50 model, where image classification is performed and the images are mapped to vectors. There are usually two kinds of residual connections, each with its own calculation. The identity shortcut x can be used directly when the input and output have the same dimensions [9], as shown in Equation (1).

y = F(x, {W_i}) + x    (1)

The shortcut still performs identity mapping when the dimensions vary, with extra zero entries padded for the increased dimension. Alternatively, the projection shortcut W_s is used to match the dimensions (done by a 1 × 1 convolution) using Equation (2).

y = F(x, {W_i}) + W_s x    (2)

Subsequently, the trained image features are fed into the RNN for captioning of the images.
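A simplified Keras rendering of these two shortcut types from [9] follows (filter counts and block structure are illustrative; ResNet50's actual blocks use a three-layer bottleneck rather than the two 3 × 3 convolutions shown here):

```python
# Sketch of identity vs. projection residual shortcuts, following [9].
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, add

def residual_block(x, filters, downsample=False):
    strides = 2 if downsample else 1
    # F(x, {W_i}): the residual mapping, simplified to two 3x3 conv layers
    y = Conv2D(filters, 3, strides=strides, padding='same')(x)
    y = Activation('relu')(BatchNormalization()(y))
    y = BatchNormalization()(Conv2D(filters, 3, padding='same')(y))

    if downsample or x.shape[-1] != filters:
        # Projection shortcut (Eq. 2): a 1x1 conv W_s matches dimensions
        x = Conv2D(filters, 1, strides=strides, padding='same')(x)
    # Identity shortcut (Eq. 1): y = F(x, {W_i}) + x
    return Activation('relu')(add([x, y]))
```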
D. Data Pre-Processing of Captions

In machine learning, data pre-processing cleans the data to obtain error-free and unified inputs. During training, the captions are the target variables, i.e., the outputs the model is being trained to predict. Using the trained weights of the dataset, it becomes easier to test on various samples of data.
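A minimal sketch of this caption preparation step follows (the 'startseq'/'endseq' boundary tokens, the tokenizer settings, and the example captions are assumptions for illustration; the paper does not detail them):

```python
# Sketch: clean captions and map words to integer indices for training.
import string
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def clean_caption(caption):
    words = caption.lower().translate(
        str.maketrans('', '', string.punctuation)).split()
    words = [w for w in words if len(w) > 1 and w.isalpha()]
    return 'startseq ' + ' '.join(words) + ' endseq'   # sequence boundary tokens

captions = [clean_caption(c) for c in
            ['A dog runs through the grass.', 'Two children play soccer.']]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(captions)
seqs = pad_sequences(tokenizer.texts_to_sequences(captions),
                     maxlen=34, padding='post')        # assumed max length
```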
E. Text-to-Speech Conversion

Once the model generates a caption, Text-to-Speech produces very humanlike raw audio data, with a broad category of custom voices to choose from. It is incorporated into the system using the gTTS API, which converts the caption to speech.
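For example, a generated caption can be converted to an audio file in a few lines with the gTTS Python package (the caption string and output filename here are illustrative):

```python
# Sketch: convert a generated caption to speech with gTTS.
from gtts import gTTS

caption = 'a dog runs through the grass'   # example model output
tts = gTTS(text=caption, lang='en')        # Google Text-to-Speech
tts.save('caption.mp3')                    # write the audio file
```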
IV. RESULTS AND DISCUSSION

Multiple models were tried for training on the dataset to obtain better results. The experiments were carried out on the Flickr8k dataset.

A. Training Procedure using VGG16

The Flickr8k dataset contains 8091 images. Initially, the CNN model used is the VGG16 network framework, as shown in Fig. 4, with an input image size of 224 × 224.

Using VGG16 as the model [10], an estimated 29 percent training accuracy was obtained on the Flickr8k dataset. The image is passed through the different convolutional layers, which use a kernel size of 3 × 3. The convolutional layers are followed by three fully connected layers (the first two have 4096 channels, and the third has 1000 channels).

Fig. 4. VGG16 architecture layers

B. Training Procedure using ResNet50

ResNet50, also called a Residual Network, was used as the CNN to train the dataset [11]. When ResNet50 is used as the model (Fig. 5), approximately 45% accuracy was obtained after training the model for 20 epochs, and 73% accuracy after training for 50 epochs.

Fig. 5. Feature extraction in (a) the ResNet50 network and (b) the VGG-16 network

On generating captions for the images, the accuracy is tabulated in Table I, which shows that ResNet50 achieves an accuracy of 73%, more accurate than VGG16 (29%).

TABLE I. RESULTS AND ACCURACY

Architecture Name              Images   Dataset     Training Accuracy
VGG16 (existing model)         8091     Flickr 8k   0.29 (50 epochs)
ResNet50                       8091     Flickr 8k   0.45 (20 epochs)
ResNet50 (animals & scenery)   2624     Flickr 8k   0.73 (50 epochs)

V. CONCLUSION AND FUTURE WORK

Image captioning is a very challenging and demanding problem in various real-time scenarios. This paper focuses on captioning images from the Flickr8k dataset using ResNet50 as the convolutional neural network and LSTM as the recurrent neural network. Experimental analysis, testing, and training were carried out for both the VGG16 and ResNet50 models. The results show that the ResNet50 model performs better than VGG16, with an accuracy of 73% for ResNet50 against 29% for VGG16. The final caption is converted from text to speech using gTTS.

Future work will focus on training on a larger number of images and datasets to improve the model's overall accuracy.
REFERENCES

[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," International Conference on Neural Information Processing Systems, Curran Associates Inc., pp. 1097-1105, 2012.
[2] S. K. Dash, S. Acharya, P. Pakray, R. Das, and A. Gelbukh, "Topic based image caption generation," Arabian Journal for Science and Engineering, 2019.
[3] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A neural image caption generator," IEEE, 2015.
[4] C. Liu, C. Wang, F. Sun, and Y. Rui, "Image2Text: A multimodal caption generator," ACM, 2016.
[5] S. Hochreiter, "The vanishing gradient problem during learning recurrent neural nets and problem solutions."
[6] V. Muley, V. Kesavan, and M. Kolhekar, "Deep learning based automatic image caption generation," IEEE, 2020.
[7] N. Vijayaraju, "Image retrieval using image captioning," San Jose State University, 2019.
[8] Z. Wang, X. Yue, Y. Chu, L. Yu, and M. Sergei, "Automatic image captioning based on ResNet50 and LSTM with soft attention," 2020.
[9] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," Microsoft Research, 2015.
[10] L. Bai, S. Liu, Y. Hua, and H. Wang, "Image captioning based on deep neural networks," 2018.
[11] S. P. P. Aung, W. P. Pa, and T. L. New, "Automatic image captioning using CNN and LSTM-based language model," 2020.


