
TRIBHUVAN UNIVERSITY

INSTITUTE OF ENGINEERING
THAPATHALI CAMPUS

A Major Project Report


On
AI Powered Shopping List Generation
And Recommendation System

Submitted By:
Chhabi Raman Adhikari (THA074BEX008)
Gaurav Bhattarai (THA074BEX012)
Gaurav Sharma (THA074BEX013)
Pratik Thapa (THA074BEX023)

Submitted To:
Department of Electronics and Computer Engineering
Thapathali Campus
Kathmandu, Nepal

May 19, 2022


TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
THAPATHALI CAMPUS

Major Project Report


On
AI Powered Shopping List Generation
And Recommendation System

Submitted By:
Chhabi Raman Adhikari (THA074BEX008)
Gaurav Bhattarai (THA074BEX012)
Gaurav Sharma (THA074BEX013)
Pratik Thapa (THA074BEX023)

Submitted To:
Department of Electronics and Computer Engineering
Thapathali Campus
Kathmandu, Nepal

In partial fulfillment for the award of the Bachelor’s Degree in Electronics and
Communication Engineering.

Under the Supervision of


Er. Rama Bastola

May 19, 2022


DECLARATION

We hereby declare that the report of the project entitled “AI Powered Shopping List
Generation And Recommendation System” which is being submitted to the
Department of Electronics and Computer Engineering, IOE, Thapathali Campus,
in the partial fulfillment of the requirements for the award of the Degree of Bachelor of
Engineering in Electronics and Communication Engineering, is a bonafide report of
the work carried out by us. The materials contained in this report have not been
submitted to any University or Institution for the award of any degree. We are the
sole authors of this work, and no sources other than those listed here have been
used in it.

Chhabi Raman Adhikari (THA074BEX008) ________________________

Gaurav Bhattarai (THA074BEX012) ________________________

Gaurav Sharma (THA074BEX013) ________________________

Pratik Thapa (THA074BEX023) ________________________

Date: May 19, 2022

i
CERTIFICATE OF APPROVAL

The undersigned certify that they have read and recommended to the Department of
Electronics and Computer Engineering, IOE, Thapathali Campus, a major project
work entitled “AI Powered Shopping List Generation And Recommendation
System” submitted by Chhabi Raman Adhikari, Gaurav Bhattarai, Gaurav
Sharma and Pratik Thapa in partial fulfillment for the award of Bachelor’s Degree in
Electronics and Communication Engineering. The project was carried out under special
supervision and within the time frame prescribed by the syllabus.

We found the students to be hardworking, skilled, and ready to undertake any work
related to their field of study, and hence we recommend this work for the partial
fulfillment of the Bachelor’s degree in Electronics and Communication Engineering.

Project Supervisor
Er. Rama Bastola
Department of Electronics and Computer Engineering, Thapathali Campus

External Examiner
Dr. Shailesh Pandey
Martin Chautari, Thapathali, Kathmandu, Nepal

Project Coordinator
Er. Umesh Kanta Ghimire
Department of Electronics and Computer Engineering, Thapathali Campus

Head of Department
Er. Kiran Chandra Dahal
Department of Electronics and Computer Engineering, Thapathali Campus

May 19, 2022


ii
COPYRIGHT

The author has agreed that the library, Department of Electronics and Computer
Engineering, Thapathali Campus, may make this report freely available for inspection.
Moreover, the author has agreed that the permission for extensive copying of this
project work for the scholarly purpose may be granted by the professor/lecturer, who
supervised the project work recorded herein or, in their absence, by the head of the
department. It is understood that the recognition will be given to the author of this report
and the Department of Electronics and Computer Engineering, IOE, Thapathali
Campus in any use of the material of this report. Copying or publication or other use of
this report for financial gain without the approval of the Department of Electronics and
Computer Engineering, IOE, Thapathali Campus, and author’s written permission is
prohibited.

Request for permission to copy or to make any use of the material in this project in
whole or part should be addressed to the Department of Electronics and Computer
Engineering, IOE, Thapathali Campus.

iii
ACKNOWLEDGEMENT

It gives us immense pleasure to express our deepest sense of gratitude and sincere
thanks to our highly respected and esteemed guide Er. Rama Bastola for her valuable
guidance, encouragement and help in getting us involved in this project. Her useful
suggestions for this project and her cooperative behavior are sincerely acknowledged.

We would like to express our sincere thanks to Er. Saroj Shakya for his guidance in
selecting this project and drafting the respective report.

Finally, we would like to express our sincere thanks to all our friends and others who
helped us directly or indirectly during this project and made us aware of current
real-world problems that can, in some measure, be solved through it.

Chhabi Raman Adhikari (THA074BEX008)

Gaurav Bhattarai (THA074BEX012)

Gaurav Sharma (THA074BEX013)

Pratik Thapa (THA074BEX023)

iv
ABSTRACT

In our day-to-day activities, where we have more important work to remember and
accomplish, many of us may not have time, or may simply forget, to maintain a tedious
shopping list. Less essential items can be bought during the next shopping trip, but
forgetting to buy some needed products can sometimes create difficult situations. The
main aim of this project is to help people generate shopping lists easily and
efficiently. The user is able to get a list of shopping items on the website.

This project processes and identifies items with the help of barcode scanning, image
recognition and speech recognition. Items having a barcode are scanned with the barcode
scanner, and the remaining items which lack a barcode are listed using image recognition
with the help of a CNN. Items that are difficult to identify through image recognition
are listed with the help of speech recognition. The website provides the user interface
for interacting with the items of the shopping list. Items that have been purchased can
be removed manually from the list, making it more efficient. Users are also provided
with options to order listed items instantly through shopping sites, along with item
recommendations based on the user’s purchase history.

Keywords: Barcode scanning, Image recognition, Item recommendation, Speech recognition, Website

v
TABLE OF CONTENTS

DECLARATION.......................................................................................................... I

CERTIFICATE OF APPROVAL ............................................................................. II

COPYRIGHT ............................................................................................................ III

ACKNOWLEDGEMENT ........................................................................................ IV

ABSTRACT ................................................................................................................. V

LIST OF FIGURES .................................................................................................... X

LIST OF TABLES ................................................................................................... XII

LIST OF ABBREVIATIONS ............................................................................... XIII

1. INTRODUCTION............................................................................................... 1

1.1 BACKGROUND INFORMATION .......................................................................... 1


1.2 MOTIVATION.................................................................................................... 2
1.3 PROJECT OBJECTIVES ....................................................................................... 2
1.4 PROJECT APPLICATIONS ................................................................................... 3
1.5 SCOPE OF PROJECT ........................................................................................... 3
1.6 REPORT ORGANIZATION ................................................................................... 3

2. LITERATURE REVIEW .................................................................................. 5

3. REQUIREMENT ANALYSIS ............................................................ 9

3.1 SOFTWARE REQUIREMENTS ............................................................................. 9


3.1.1 Django .................................................................................................... 9
3.1.2 TensorFlow ............................................................................................ 9
3.1.3 Kaggle .................................................................................................... 9

4. DATASET ANALYSIS .................................................................... 11

4.1 SELECTING THE DATASET ............................................................................... 12


4.2 PREPARING DATASET FOR TRAINING .............................................................. 12
4.3 SHUFFLING THE DATASET .............................................................................. 12
4.4 NORMALIZING THE TRAINING DATA ............................................................... 12
4.5 DEFINING, COMPILING AND TRAINING THE CNN MODEL ............................... 12
4.6 ACCURACY AND SCOPE OF MODEL ................................................................ 12

vi
5. SYSTEM ARCHITECTURE AND METHODOLOGY ............................... 14

5.1 SYSTEM BLOCK DIAGRAM ............................................................................. 14


5.2 ELABORATION OF WORKING PRINCIPLE ......................................................... 15
5.3 NEURAL NETWORK ARCHITECTURE .............................................................. 15
5.3.1 Convolutional Neural Network (CNN) ................................................ 17
5.3.1.1 Convolutional Layers: ........................................................... 19
5.3.1.2 Pooling Layer: ....................................................................... 20
5.3.1.3 Flattening:.............................................................................. 21
5.3.1.4 Fully Connected Layer: ......................................................... 22
5.3.1.5 Dropout.................................................................................. 23
5.3.1.6 Activation Function ............................................................... 23
5.3.2 AlexNet Algorithm .............................................................................. 25
5.3.3 VGG-16 Algorithm .............................................................................. 27
5.3.4 VGG-19 Algorithm .............................................................................. 28
5.4 RECOMMENDATION SYSTEM .......................................................................... 30
5.4.1 K means clustering Algorithm ............................................................. 30

6. IMPLEMENTATION DETAILS .................................................................... 32

6.1 TENSORFLOW LAYERS ................................................................................... 32


6.2 TRAINING PROCEDURE ................................................................................... 33
6.2.1 Libraries imported for image recognition ............................................ 34
6.2.2 Learning Rate ....................................................................................... 34
6.2.3 Confusion matrix ................................................................................. 35
6.2.4 Precision recall graph ........................................................................... 35
6.2.5 Regularization ...................................................................................... 35
6.2.6 Optimizers ............................................................................................ 35
6.2.7 Hyper-parameter Tuning ...................................................................... 36
6.3 WEBSITE DEVELOPMENT ............................................................................... 36
6.3.1 Creating virtual Environment .............................................................. 36
6.3.2 Django project creation ........................................................................ 36
6.3.3 Creating an App ................................................................................... 36
6.3.4 Changing in Model .............................................................................. 36
6.4 WEB SCRAPING .............................................................................. 37
6.5 BARCODE ANALYSIS ...................................................................................... 37

vii
6.5.1 Barcode ................................................................................................ 37
6.5.2 Barcode API ......................................................................................... 37
6.5.3 Barcode Generator ............................................................................... 38
6.5.4 Barcode information ............................................................................ 38
6.6 TEXTUAL CLUSTERING-BASED RECOMMENDATION SYSTEM (K-MEANS
ALGORITHM) ........................................................................................................ 38

6.6.1 Dataset.................................................................................................. 38
6.6.2 Libraries imported for Recommendation ............................................. 38
6.6.3 Text conversion .................................................................................... 39
6.6.4 Elbow method for optimum value of cluster ....................................... 40
6.6.5 Silhouette Plot for validation of cluster size ........................................ 40
6.6.6 Fitting K-means to the dataset ............................................................. 40
6.6.7 Predicting the cluster based on keywords given .................................. 40
6.7 SPEECH RECOGNITION ................................................................................... 40

7. RESULT AND ANALYSIS .............................................................................. 42

7.1 BARCODE RESULT.......................................................................................... 42


7.2 WEB SCRAPING RESULT .................................................................. 42
7.3 USER INTERFACE ........................................................................................... 43
7.4 WEBSITE RESULT ........................................................................................... 44
7.4.1 Shopping list ........................................................................................ 44
7.4.2 Instant Order ........................................................................................ 45
7.4.3 Website database .................................................................................. 46
7.5 CNN RESULTS WITH DATASET ....................................................................... 47
7.5.1 Accuracy of model with 2975 images ................................................. 47
7.5.2 5258 images (2975 original +2283 augmented images) plot ............... 48
7.5.3 Calculation of Learning rate ................................................................ 49
7.5.4 Training-validation accuracy and loss for AlexNet ............................. 50
7.5.5 Training-validation accuracy and loss for VGG-16............................. 51
7.5.6 Accuracy and loss for VGG-19............................................................ 51
7.5.7 Confusion matrix ................................................................................. 52
7.5.7.1 Confusion matrix for AlexNet............................................... 52
7.5.7.2 Confusion matrix for VGG-16 .............................................. 53
7.5.7.3 Confusion matrix for VGG-19 .............................................. 54

viii
7.5.8 Precision Recall Curve (PR curve) ...................................................... 55
7.6 RECOMMENDATION SYSTEM RESULTS ............................................................ 57
7.6.1 Input Dataset ........................................................................................ 57
7.6.2 Data conversion ................................................................................... 58
7.6.3 Elbow diagram ..................................................................................... 58
7.6.4 Visualization of cluster in linear diagram ............................................ 59
7.6.5 Silhouette Plot for cluster size obtained from Elbow method ............. 60
7.6.6 Top keywords of cluster....................................................................... 61
7.6.7 Recommendation of cluster ................................................................. 62
7.7 SPEECH RECOGNITION ................................................................................... 65
7.7.1 Output from Vosk ................................................................................ 65
7.7.2 Generate CSV file ................................................................................ 65
7.7.3 Shopping list generation ...................................................................... 65

8. FUTURE ENHANCEMENTS ......................................................................... 67

9. CONCLUSION ................................................................................................. 68

10. APPENDICES ................................................................................................... 69

APPENDIX A: PROJECT SCHEDULE ....................................................................... 69


APPENDIX B: VALUE OF ORIGINAL DATA ............................................................. 70
APPENDIX C: VALUE OF AUGMENTED DATA ........................................................ 71
APPENDIX D: TRAINING RESULT OF VGG19 MODEL ............................................ 72

REFERENCE ............................................................................................................. 74

ix
LIST OF FIGURES

Figure 5-1: Block diagram of System .......................................................................... 14


Figure 5-2: Neural Network Architecture .................................................................... 16
Figure 5-3: Convolutional Neural Network (CNN) ..................................................... 19
Figure 5-4: Convolutional Layer ................................................................................. 19
Figure 5-5: Pooling Layer ............................................................................................ 21
Figure 5-6: Flattening Operation ................................................................................. 22
Figure 5-7: Fully Connected Layer .............................................................................. 23
Figure 5-8: Softmax Function ...................................................................................... 24
Figure 5-9: Relu Function ............................................................................................ 24
Figure 5-10: AlexNet Architecture .............................................................................. 26
Figure 5-11: VGG 16 Algorithm ................................................................................. 27
Figure 5-12: VGG-19 Flowchart ................................................................................. 29
Figure 5-13: K-means algorithm flowchart ................................................................. 31
Figure 7-1: Barcode Result .......................................................................................... 42
Figure 7-2: Web scraping result ................................................................... 43
Figure 7-3: User Interface ............................................................................................ 44
Figure 7-4: Shopping list ............................................................................................. 45
Figure 7-5: Instant Order ............................................................................................. 46
Figure 7-6: Website database ....................................................................................... 47
Figure 7-7: Graph of non-augmented data ................................................................... 48
Figure 7-8: Graph of augmented data .......................................................................... 49
Figure 7-9: Learning rate for VGG19 .......................................................................... 50
Figure 7-10: Training and validation accuracy and loss curve for AlexNet model ..... 50
Figure 7-11: Accuracy and loss curve for VGG16 model ........................................... 51
Figure 7-12: Accuracy and loss curve for VGG19 model ........................................... 52
Figure 7-13: Confusion matrix for AlexNet model .................................. 53
Figure 7-14: Confusion matrix for VGG16 model ................................... 54
Figure 7-15: Confusion matrix for VGG19 model ................................... 55
Figure 7-16: PR Curve for AlexNet model ................................................................. 56
Figure 7-17: PR Curve for VGG16 model ................................................................... 56
Figure 7-18: PR curve for VGG19 model.................................................................... 57
Figure 7-19: Sample of CSV dataset generated from Django database....................... 58

x
Figure 7-20: Sparse matrix of text data........................................................................ 58
Figure 7-21: Elbow Diagram for K-means algorithm.................................................. 59
Figure 7-22: Cluster visualization ................................................................................ 60
Figure 7-23: Silhouette Score plot for clustering ......................................................... 61
Figure 7-24: Top keywords of cluster .......................................................................... 62
Figure 7-25: Recommendation sample ........................................................................ 63
Figure 7-26: Recommended items ............................................................................... 64
Figure 7-27: Output from Vosk ................................................................................... 65
Figure 7-28: CSV file................................................................................................... 65
Figure 7-29: Grocery item added to the shopping list ................................................. 66
Figure 10-1: Value of original data ............................................................................. 70
Figure 10-2: Value of augmented data ......................................................... 71
Figure 10-3: Training result of VGG19 model ............................................. 73

xi
LIST OF TABLES

Table 4-1: Dataset ........................................................................................................ 13


Table 10-1: Gantt chart of Project Schedule ................................................................ 69

xii
LIST OF ABBREVIATIONS

AI Artificial Intelligence
CNN Convolution Neural Network
CV Computer Vision
MVC Model View Controller
NLP Natural Language Processing
IoT Internet of Things
UI User Interface
ReLU Rectified Linear Unit
VGG Visual Geometry Group
TF-IDF Term-Frequency/Inverse Document Frequency
WCSS Within-Cluster-Sum-of-Squares
PR Precision Recall

xiii
1. INTRODUCTION

1.1 Background Information

Our project is essentially a combination of barcode scanning, image recognition and
speech recognition for identifying grocery items and listing them on a website to
create a shopping list for a particular user. Efforts have been made to create a
website with a good user interface and features such as product removal/deletion and
an item recommendation engine. Users can purchase the items that are recommended, or
that they have added to the list, through popular national and international
e-commerce websites. An e-commerce website is a secure platform for selling and
purchasing products digitally; it also handles secure transactions between the
consumer and the e-commerce site. Online shopping through e-commerce sites has been
popular for many years, and various websites are available for e-commerce
transactions; some of the popular sites are Alibaba, Amazon and Sastodeal. These
sites have been serving consumers for decades. Various systems have been established
for smart shopping using shopping lists, and by making the interface between
consumers and e-commerce sites easier, such projects are finding applications in many
places. The motivation for selecting this idea as our major project is GeniCan, an
international product acquired by Amazon. GeniCan, however, uses only barcode
scanning and voice recognition technology to create or update a user's shopping list
in a mobile app. We have incorporated image recognition along with the barcode
scanner and voice recognition for ease of use, together with other features such as
item recommendation and instant ordering of added items.

Computer vision emerged in areas where computer analysis is required for image
identification and classification. Image recognition is a part of computer vision
that has further advanced digital image detection, extraction of useful patterns and,
most importantly, supported decision making and automation. Similarly, recommendation
systems have been helpful in various e-commerce sites by easing the user experience.
There have been many projects regarding image recognition and its analysis. This
project has also been conceptualized from the same idea, with certain variations and
some added functionality, in the hope of making it useful in real life.

1
Speech recognition, i.e., computer processing of speech and speech-to-text
translation, has been employed in various projects. It involves processing human
speech into written form and finally recognizing specific speech. Voice recognition
is a bit different from speech recognition, in that the user's voice is identified
and no further processing is performed. Speech processing finds application in many
places, such as generating a product name from user speech.

1.2 Motivation

Since the introduction of computer vision and image processing, various projects have
been developed to get rid of tedious and time-consuming tasks. This project is
developed to provide a better shopping experience through automatic shopping list
generation and other integrated features. Prevailing shopping list generation systems
focus on listing the items only; the existing shopping list generators do not have
features such as a recommendation system or e-commerce site integration. The user has
to put in extra effort to find an appropriate e-commerce website, and in the absence
of a recommendation system the user has to rely on the input system every time to
generate the shopping list. Considering these shortcomings of the existing systems,
we have worked on this project to provide a better solution by incorporating
e-commerce websites and a recommendation system. This project aims to provide users
with a better shopping experience and to relieve them from the troublesome process of
making a manual shopping list. The user will not be required to scroll through a
number of websites just to find a suitable option. Also, with the help of the
recommendation system, the user can instantly add items without using the provided
input methods.

1.3 Project Objectives

Like every project, this project has some specific objectives, which are listed below.

• To generate a shopping list using barcode, image and speech recognition.

• To recommend related items to the users.

2
1.4 Project Applications

This project finds its application in different fields with few or no modifications. The
applications are listed as follows.

• Hotels and restaurants – Since the daily consumption of different items occurs in
large amounts, it is often challenging to keep track of all the items. The proposed
system can ease the process of buying the required products.
• Household – Products used in the house can be recognized by image processing and
listed automatically before they are thrown into the dustbin. The list obtained can
be accessed later to shop for the items as required.
• Shopping marts – With a few modifications to the dataset, the prices of the listed
items can be obtained easily while shopping in marts.

This project also finds applications in various other fields, such as online shopping
from the list, expiry date alert systems, etc.

1.5 Scope of project

The developed system is a convenient method for generating a shopping list. For
generating a list of products, the system is equipped with three different
recognition methods. The object thus recognized is listed, and the user can access
this list later and specify the objects required. The system then generates the best
match for the given item and provides an interface for buying the product. Also, the
system is equipped with a recommendation system that is handy in case the user needs
to order more items.

1.6 Report organization

Regarding the report organization, the whole report is divided into several chapters.
The introduction to the project topic is given in chapter 1, along with the project's
specific objectives, applications and scope. Chapter 2 covers the background and
literature review related to the project. Requirement analysis of the different
software tools is done in chapter 3. The dataset associated with the project and its
analysis are described in chapter 4. The algorithms and methodology used in this
project, together with the block diagram and its working principle, are described in
chapter 5.

3
The implementation details of these algorithms and methods are given in chapter 6.
The output is discussed in chapter 7. Chapter 8 contains the future enhancements and
chapter 9 the conclusion of the project. Chapter 10 includes the appendices related
to the project.

4
2. LITERATURE REVIEW

There have been many technological developments in internet-assisted shopping within
the past few decades. Today's reach of the web and the availability of mobile
development technology have contributed to these innovative advancements. Various
other projects related to this one have been carried out in the past. One of them is
the hybrid shopping list: Heinrichs F., Schreiber D. and Schöning J., under the
supervision of Prof. Dr. Antonio Krüger, worked on a prototype for a hybrid mobile
application combining the benefits of paper and electronic shopping lists [1].
Similarly, Marcus Liwicki and three team members (Sandra Thieme, Gerrit Kahl, and
Andreas Dengel) developed a system, termed the intelligent shopping list, that
automatically extracts the intended items for purchase from a handwritten shopping
list [2]. Another interesting development was work on a prototype for creating a
shopping list from multiple source devices such as desktops, smartphones and
telephones, in several formats (structured text, audio, still pictures, video,
unstructured text and annotated media), titled the multimodal shopping list [3].
Similar to Marcus Liwicki's study, Nurmi and his group introduced a product retrieval
system that maps the content of written shopping lists onto the relevant real-world
products in a grocery store [4]. The GeniCan project was also done previously, using
a barcode scanner and speech recognition to generate a shopping list. The products
were recognized before being thrown into the dustbin, the list was generated using a
mobile app, and the device with a camera was attached to the dustbin [5].

Apart from the above projects, we added a few more features to make our project more
efficient and interactive. A recommendation system is used in this project for the
benefit of the users; it recommends items similar to the items that the user has just
bought. Comparative analysis of different shopping sites for the respective item and
recommendation of the best options is also available, so that the user can find the
best among the different alternatives. We also emphasize image recognition for items
which lack a barcode.

5
Collaborative filtering recommendation technology can be divided into user-based and
item-based recommendation technology. User-based collaborative filtering predicts
item ratings based on the ratings of other users in order to generate item
recommendations. However, its recommendation quality is strongly affected by the
sparsity of user rating data. Content-based recommendation technology analyzes the
characteristics of the item content information and calculates the degree of matching
with the user's interests in order to recommend items. Therefore, compared with
collaborative filtering recommendation, content-based recommendation is less
dependent on rating data. [6]

Since the spread of the MVC (Model View Controller) pattern into web development,
Python has provided quite a few choices when it comes to web frameworks, such as
Django, TurboGears and Zope. Though selecting one out of the many may be confusing
initially, having several competing frameworks can only be a good thing for the
Python community, because it drives the development of all frameworks further and
provides a rich set of options to choose from. [7]

A real-world collaborative filtering recommendation system was implemented in a large
Korean fashion company that sells fashion products through both online and offline
shopping malls. The company's recommendation environment has the following
distinctive characteristics: first, the company's online and offline stores sell the
same products; second, fashion products are usually seasonal, so customers' general
preferences change with the time of year; last, customers usually purchase items to
replace previously preferred items or to complement those already bought. The authors
propose a system called K-RecSys, which extends the typical item-based collaborative
filtering algorithm by reflecting the above domain characteristics. K-RecSys combines
online product click data and offline product sale data, weighted to reflect the
online and offline preferences of customers [8]. It also adopts a preference decay
function to reflect changes in preferences over time, and finally recommends
substitute and complementary products using product class information. In our project
we also implement a recommendation system, so that the user is recommended the best
shopping site as well as items that may need to be purchased alongside.

6
There are four main ways in which recommender systems produce a list of
recommendations for a user: content-based, collaborative, demographic and hybrid
filtering. In content-based filtering the model uses the specifications of an item in
order to suggest further items with similar properties. Collaborative filtering uses
the past behavior of the user, such as items the user previously viewed or purchased,
in addition to any ratings the user gave those items and similar conclusions drawn
from other users' item lists, to predict items that the user may find interesting.
Demographic filtering reads user profile data such as age group, gender, education
and living area to find similarities with other profiles and produce a new
recommendation list. Hybrid filtering combines all three filtering techniques. This
project proceeds using collaborative filtering [9]. Amazon.com launched item-based
collaborative filtering in 1998, enabling recommendations at a previously unseen
scale for millions of customers and a catalog of millions of items. [10]

We use speech recognition algorithms daily with our phones, computers, home
assistants, and more. Each of these systems uses algorithms to convert sound waves
into useful data for processing, which is then interpreted by the machine. Some of
these machines use older algorithms, while newer systems use neural networks to
interpret this data. These systems then produce an output in the form of text to be
used. A large amount of training data is required to make these algorithms and neural
networks perform effectively. One of the more practical approaches is the Hidden
Markov Model (HMM) with an end-point detection algorithm for pre-processing to remove
unwanted noise. The HMM requires the addition of other tools to properly interpret
speech. [11]

Data augmentation of input features derived from the Short-Time Fourier Transform
(STFT) has become a popular approach. However, for several speech processing tasks
there is evidence that the combination of STFT-based and Hilbert–Huang Transform
(HHT)-based features improves performance. The Hilbert spectrum can be obtained with
adaptive mode decomposition (AMD) techniques, which are noise-robust and appropriate
for non-linear and non-stationary signal analysis. [12]
The Caffe framework and the AlexNet model have been used to extract feature data from
pictures. AlexNet, as a representative deep neural network, is an 8-layer model: it
contains five convolution layers and three fully connected layers. Because of the
deep structure and the many parameters in the model, it extracts more features from
the original data than a traditional CNN. [13]

7
The K-means algorithm has been applied to determine clusters of sentences for the
formation of a final summary, with the value of 'K' determined using the Elbow
method [14]. The scores for each sentence are computed using the
Term-Frequency/Inverse Document Frequency (TF-IDF) of the constituent words and their
overlap with the title of the story and its position value. Deep learning has ushered
in a new era of machine learning beyond vision alone; convolutional neural networks
have been implemented in image classification, segmentation and object detection. [15]
VGG19 is a similar model architecture to VGG16 with three extra convolutional layers;
it consists of a total of sixteen convolution layers and three dense layers. In VGG
networks, the use of 3×3 convolutions with stride one provides an effective receptive
field equivalent to 7×7. This implies that there are far fewer parameters to
estimate. Every year the ImageNet competition is hosted, in which a smaller version
of the ImageNet dataset (with one thousand categories) is used with the aim of
accurately classifying the images. Several winning solutions of the ImageNet
Challenge have used state-of-the-art convolutional neural network architectures to
beat the best possible accuracy thresholds. [16]

8
3. REQUIREMENT ANALYSIS

3.1 Software Requirements

3.1.1 Django

Django is a Python web framework that allows developers to create modern websites.
The pre-existing framework allows users to create a website without starting from
scratch. Django has an active community of experienced developers, and the
professional documentation contributes a lot during website development. It offers
options for free and paid-for support. Being written in Python, Django is supported
on a number of platforms, such as Linux, Windows and macOS, and web hosting providers
offer the necessary resources to host websites developed with Django. In this
project, Django is used for website development, which is the major platform for the
user interface. Django serves as our backend to perform the CRUD (Create, Retrieve,
Update, Delete) operations on tables in a database, and it also helped with the
frontend work of our web development.
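As an illustration of how these CRUD operations look in Django's ORM, a minimal
sketch is given below. The ShoppingItem model and its fields are hypothetical names
chosen for illustration, not the project's actual schema.

    # models.py -- a hypothetical shopping-list model (illustrative only)
    from django.db import models

    class ShoppingItem(models.Model):
        name = models.CharField(max_length=100)            # e.g. "apple"
        quantity = models.PositiveIntegerField(default=1)
        added_on = models.DateTimeField(auto_now_add=True)

    # Elsewhere (e.g. in a view), the ORM performs the CRUD operations:
    # ShoppingItem.objects.create(name="apple", quantity=3)          # Create
    # items = ShoppingItem.objects.all()                             # Retrieve
    # ShoppingItem.objects.filter(name="apple").update(quantity=5)   # Update
    # ShoppingItem.objects.filter(name="apple").delete()             # Delete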

3.1.2 TensorFlow

TensorFlow is one of the most widely used platforms for machine learning due to its
abundance of community resources, tools and libraries. Data scientists and developers
can efficiently build and deploy ML-powered applications with it. Keras is an API,
written in Python, that runs on top of the TensorFlow platform. The TensorFlow
libraries are used for image recognition and speech recognition in this project.

3.1.3 Kaggle

Kaggle is a platform where users can find relevant datasets, publish their own
datasets, work with professionals and take part in various challenges related to data
science. The simplest and best-supported file type available for handling tabular
data is the Comma-Separated Value (CSV) format. Kaggle hosts competitions aimed at
the development of machine learning. It is a Python-based environment with a large
amount of shared code and libraries that are helpful in various deep learning
projects. Since our personal computers lacked high-performance graphics and other
processing units, Kaggle provided the CPUs, GPUs, memory (RAM) and other essential
components needed to develop computationally complex and deep CNN models. We were
able to use Kaggle's GPU for 30 hours per week. Our project also involves this kind
of computationally heavy processing, which was performed using Kaggle.

9

10
4. DATASET ANALYSIS

Image classification deals with observing patterns in the dataset in order to extract
features from the images. We use filters when working with CNNs. Filters help us
exploit the spatial locality of a particular image and enforce a local connectivity
pattern between neurons.

The sources of our dataset are as follows:

➢ Google
The images with proper size and orientation were collected from Google. Very few of
those images matched our criteria, and we had to filter out many images that were not
fit for our project.

➢ Kaggle
Some of the images were obtained from Kaggle datasets. We downloaded those datasets
and selected only the images we needed, as they contained many similar images.

➢ Camera
We took photos of the items we needed from various stores and resized them according
to our needs before adding them to our dataset manually.

Contents of the dataset are as follows:

The dataset contains various images of grocery items (such as apple, banana, maize
and cauliflower) which are not recognized by the barcode scanner. There are 19
classes of items in our dataset, as shown in Table 4-1. The dataset contains 23927
images in total, of which 19142 are used for training and 4785 for validation.

11
The various steps that we followed for the CNN dataset can be listed as follows: -

4.1 Selecting the dataset

The dataset of interest for image classification has been prepared by us. JPEG and
PNG images were collected, grouped and labelled by folder name. Altogether, about 19
classes have been collected for our dataset, which is zipped in order to make it
available for training the CNN. The items collected are of the grocery type. Each
folder contains the data required for identifying the corresponding item after
training.

4.2 Preparing dataset for training

The dataset thus obtained is made ready for training. Resizing and labeling of the
images is done; resizing to 224×224 gives optimum output. An array maintains the
image pixel values along with the class index for each image in the list.
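A minimal sketch of this preparation step, assuming OpenCV and a folder layout of
dataset/<class name>/<image files> (the paths and constants are illustrative):

    import os
    import cv2          # OpenCV for reading and resizing images

    IMG_SIZE = 224                       # images are resized to 224x224
    DATA_DIR = "dataset"                 # assumed layout: dataset/<class_name>/<image>.jpg
    classes = sorted(os.listdir(DATA_DIR))

    data = []                            # list of (pixel_array, class_index) pairs
    for label, class_name in enumerate(classes):
        class_dir = os.path.join(DATA_DIR, class_name)
        for file_name in os.listdir(class_dir):
            img = cv2.imread(os.path.join(class_dir, file_name))
            if img is None:              # skip unreadable files
                continue
            img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
            data.append((img, label))    # pixel values stored alongside the class index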

4.3 Shuffling the dataset

The dataset is shuffled, which gives better results. Random shuffling prevents the
model from seeing the samples grouped by class and helps maintain a reliable score.
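Continuing the sketch above, shuffling the prepared (image, label) pairs is a single
call (illustrative only):

    import random

    random.shuffle(data)   # randomize sample order so batches are not grouped by class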

4.4 Normalizing the training data

The data being forwarded is scaled to standardize the input to a layer. This helps in
stabilizing the learning process and leads to faster convergence.
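A minimal sketch of this normalization, again continuing from the prepared data list
above (illustrative, assuming pixel values in the 0-255 range):

    import numpy as np

    X = np.array([img for img, _ in data], dtype="float32")   # image tensor
    y = np.array([label for _, label in data])                # integer class labels

    X /= 255.0   # scale pixel values from [0, 255] to [0, 1] for stable, faster training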

4.5 Defining, compiling and training the CNN Model

The CNN model is defined and compiled, and then trained on the dataset. Depending on
the volume and quality of the dataset, the model may or may not fit well.
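The sketch below defines, compiles and trains a small Keras CNN on the arrays
prepared above. It is a simplified illustration only; the project's actual models
(AlexNet, VGG-16, VGG-19) are described in chapter 5.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    NUM_CLASSES = 19   # number of grocery classes in the dataset

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                                # regularization against overfitting
        layers.Dense(NUM_CLASSES, activation="softmax"),    # one probability per class
    ])

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",   # integer class labels
                  metrics=["accuracy"])

    # X and y come from the preparation steps above; 20% is held out for validation
    history = model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)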

4.6 Accuracy and Scope of Model

The score and accuracy of the model can then be obtained. Training can be analyzed
and repeated several times to fine-tune the performance of the CNN model.

12
Table 4-1: Dataset

Name of item No of images Name of item No of images

Apple 1785 Ginger 1062

Banana 1575 Maize vegetable 1230

Bitter Gourd 1160 Onion 1312

Broccoli 1192 Orange 1466

Capsicum 1854 Potato 1477

Carrot 1000 Pumpkin 1000

Cauliflower 1115 Radish 1317

Chili 1000 Tomato 1182

Dragon fruit 980 Total images 23927

Eggs 1057 Training images 19142

Garlic 1040 Validation images 4785

13
5. SYSTEM ARCHITECTURE AND METHODOLOGY

5.1 System Block Diagram

Figure 5-1: Block diagram of System

14
5.2 Elaboration of working Principle

The system is able to generate a list of items by recognizing a given item. In order
to recognize different types of items, the objects are classified as objects with a
barcode and objects without a barcode. The objects without a barcode are further
classified into objects that are recognizable through image recognition and objects
that cannot be recognized through image recognition.

There are three different methods applied to identify the objects based on this
classification. Objects with a barcode are identified with the help of barcode
scanning. For the objects which do not contain a barcode but are recognizable through
image recognition (for example, notebooks), an image is taken and processed for image
recognition. Finally, for the objects which neither contain a barcode nor are
recognizable through an image (for example, sugar), the system allows the user to
enter the product name through voice recognition.

When the object is recognized, this object is added to the shopping list. The shopping
list is updated every time when a new object is entered. If the user needs to order an
item from the list, the system provides an interface to select the item that needs to be
ordered. Once the system knows what the user is willing to buy, the system goes
through the process of exploring the best alternatives within options and generates an
interface to allow the user to buy the item.

When the user is done purchasing the item, the recommendation system prompts the
user with items that are frequently bought together with the item the user has recently
bought.
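The overall decision flow can be summarized in the Python sketch below. The helper
functions are stubs standing in for the barcode, CNN and speech modules described in
later chapters, and the 0.8 confidence threshold is an assumed value, not the
project's actual setting.

    # Placeholder stubs standing in for the real recognition modules.
    def scan_barcode(product):          # barcode scanner module (stub)
        return product.get("barcode")

    def lookup_barcode(code):           # barcode database/API lookup (stub)
        return "item-" + str(code)

    def recognize_image(product):       # CNN classifier (stub): returns (name, confidence)
        return product.get("image_guess", ("unknown", 0.0))

    def recognize_speech():             # speech recognition module (stub)
        return input("Enter the spoken item name: ")

    def identify_item(product):
        """Decide which input method identifies the product, mirroring the block diagram."""
        code = scan_barcode(product)
        if code is not None:            # 1) barcode available -> use it
            return lookup_barcode(code)
        name, confidence = recognize_image(product)
        if confidence >= 0.8:           # 2) image recognition confident enough
            return name
        return recognize_speech()       # 3) fall back to speech input

    shopping_list = []
    shopping_list.append(identify_item({"barcode": "9845623"}))
    print(shopping_list)                # ['item-9845623']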

5.3 Neural Network Architecture

A neural network is an advanced program composed of a number of simple, highly
interconnected processing elements. A neural network consists of an input and an
output layer, with one or more hidden layers in between. Neural networks are highly
efficient at extracting meaningful information from unstructured data and imprecise
patterns, and are able to operate in real time.

15
Some of the most important real-world business applications of artificial neural
networks include sales prediction, manufacturing process control, risk management and
mitigation, validation, targeted marketing, and customer analysis. Highly specialized
uses of neural networks include detection of mines under the sea, diagnosis of
diseases, 3D vision, face and speech recognition, handwriting recognition, etc.

Figure 5-2: Neural Network Architecture

A neuron is the basic unit that receives input from an external source or from other
nodes. Each node is connected to nodes in the subsequent layer, and each such
connection has a specific weight. Weights are assigned to a neuron based on its
relative importance compared with the other inputs. Once all the node values from the
input layer are multiplied by their weights and summed, a value for the first hidden
layer is generated. Based on this summed value, the first hidden layer applies a
predefined "activation" function that determines whether or not the node is
"activated" and how "active" it will be. Similar computation is carried out in each
layer, generating the values for the next hidden layer. After passing through
multiple hidden layers, the result is ultimately obtained in the output layer. Most
of the computation of a neural network takes place in the hidden layers: the hidden
layers take all the inputs from the input layer and perform the required
calculations. The result is then forwarded to the output layer so that the user can
see the result of the computation.
In a neural network, the learning (or training) process is initiated by dividing the
data into three different sets:

16
• Training dataset – This dataset allows the Neural Network to understand the
weights between nodes.
• Validation dataset – This dataset is used for fine-tuning the performance of the
Neural Network.
• Test dataset – This dataset is used to determine the accuracy and margin of error
of the Neural Network.
Once the data is segmented into these three parts, Neural Network algorithms are
applied to them for training the Neural Network.

5.3.1 Convolutional Neural Network (CNN)

A CNN is a feed-forward neural network that processes structured arrays of data such
as images. Convolutional neural networks pick up patterns in the input image, such as
lines, gradients, circles, or even eyes and faces. Convolutional neural networks
contain many convolutional layers stacked on top of each other, each one capable of
recognizing more sophisticated shapes. A CNN applies multiple layers to images and
uses filtering to analyze image inputs.

In the case of fully connected deep neural networks, each neuron in a given layer is
connected to all the neurons in the previous layer. Because of the large number of
connections, the number of parameters to be learned increases. As the number of
parameters increases, the network becomes more complex, and this added complexity can
lead to overfitting. In particular, when the pixel values of images are used directly
as features, the number of input features is very large, and most of the pixels of an
image may not contribute to predicting the output. To overcome these challenges,
convolutional neural networks were developed. Here, the input image data is subjected
to a set of convolution operations such as filtering and max pooling. The resulting
data, which is of much smaller dimension than the original image data, is then passed
to fully connected layers to predict the output. By performing the convolution
operations, the dimensionality of the data shrinks significantly, which decreases the
number of parameters to be learned. Hence, the network complexity decreases, which
leads to a lower chance of overfitting.

The reasons behind using CNN are as follows:

17
1. Convolutional layers make use of inherent properties of images.
2. CNN would give better performance when trained on shuffled images.
3. CNN take advantage of local spatial coherence of images. This means that they
are able to reduce dramatically the number of operations needed to process an
image by using convolution on patches of adjacent pixels, because adjacent
pixels together are meaningful. We also call that local connectivity. Each map
is then filled with the result of the convolution of a small patch of pixels, slid
with a window over the whole image.
4. There are also the pooling layers, which downscale the image. This is possible
because we retain throughout the network, features that are organized spatially
like an image, and thus downscaling them makes sense as reducing the size of
the image. On classic inputs you cannot downscale a vector, as there is no
coherence between an input and the one next to it.

As CNN provides these advantages, we used it in our project rather than other neural
network types such as plain DNNs and RNNs, which are less suitable and more complex
for this task than a CNN.

The learning approaches a computer can use are supervised, unsupervised and
semi-supervised learning. We used supervised learning for our CNN model. In
supervised learning, all the observations in the dataset are labeled, and the
algorithm learns to predict the output from the input data. Likewise, we have the
input images and the output labels, and we trained our model to obtain the output
from those inputs using a CNN.

18
Figure 5-3: Convolutional Neural Network (CNN)

A CNN is a combination of layers which transform an image into an output that is
understood by the model. A CNN consists of the following layers:

5.3.1.1 Convolutional Layers:

Convolutional layers produce a feature map by applying filters that scan the image a
few pixels at a time. The most common type of convolution used is the 2D convolution
layer, abbreviated as Conv2D. A filter slides over the 2D input data, performing
element-wise multiplication, and sums the results into a single output pixel. The
same operation is performed for each location it slides over, transforming a 2D
matrix of features into a different 2D matrix of features.

Figure 5-4: Convolutional Layer
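To make the element-wise multiply-and-sum concrete, the toy NumPy loop below
convolves a 4x4 input with a 2x2 filter (values chosen arbitrarily), producing a 3x3
feature map with no padding and stride one:

    import numpy as np

    image = np.array([[1, 2, 0, 1],
                      [3, 1, 1, 0],
                      [0, 2, 2, 1],
                      [1, 0, 1, 3]])
    kernel = np.array([[1, 0],
                       [0, -1]])                    # a 2x2 filter

    out_h = image.shape[0] - kernel.shape[0] + 1    # 3 rows (no padding, stride 1)
    out_w = image.shape[1] - kernel.shape[1] + 1    # 3 columns
    feature_map = np.zeros((out_h, out_w))

    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + 2, j:j + 2]
            feature_map[i, j] = np.sum(patch * kernel)   # element-wise multiply, then sum

    print(feature_map)                              # the resulting 3x3 feature map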

Feature map:

Feature maps are generated by applying filters or feature detectors to the input
image or to the feature map output of the previous layers. Visualizing a feature map
can give insight into the internal representation of a specific input at each of the
convolutional layers in the model.

The steps to visualize the feature maps are (a minimal sketch follows this list):

1. Define a new model that takes an image as input and outputs the feature maps, i.e.
the intermediate representations of all layers after the first one. This model is
based on the model used for training.

2. Load the input image whose feature maps we want to inspect, to see which features
were prominent in classifying it.

3. Convert the image to a NumPy array.

4. Normalize the array by rescaling it.

5. Run the input image through the visualization model to obtain all intermediate
representations for that image.

6. Create plots for all of the convolutional layers and the max pooling layers, but not
for the fully connected layers.
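
A minimal Keras sketch of these steps is given below. The 'model' variable, the image
path and the 224×224 input size are illustrative assumptions rather than the project's
actual code.

import numpy as np
from tensorflow import keras

def show_feature_maps(model, img_path, img_size=(224, 224)):
    # Step 1: a model whose outputs are the feature maps of every conv/pool layer
    layer_outputs = [layer.output for layer in model.layers
                     if isinstance(layer, (keras.layers.Conv2D, keras.layers.MaxPooling2D))]
    vis_model = keras.Model(inputs=model.input, outputs=layer_outputs)

    # Steps 2-4: load the image, convert it to a NumPy array and rescale it to [0, 1]
    img = keras.utils.load_img(img_path, target_size=img_size)
    x = keras.utils.img_to_array(img)[np.newaxis] / 255.0

    # Step 5: run the image through the visualization model
    feature_maps = vis_model.predict(x)

    # Step 6: inspect (or plot) every convolutional and max pooling feature map
    for layer_output, fmap in zip(layer_outputs, feature_maps):
        print(layer_output.name, fmap.shape)   # plot fmap[0, :, :, i] with matplotlib if desired
    return feature_maps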

Zero padding: Zero-padding is the process of symmetrically adding zeroes to the input
matrix. It is a commonly used modification that allows the size of the input to be
adjusted to our requirement. It is used when designing CNN layers if the size of the
input volume should be preserved in the output volume.

Stride: Stride is the number of pixels the filter shifts over the input matrix at each
step.

5.3.1.2 Pooling Layer:

The pooling layer operates on each feature map separately to create a new set of the
same number of pooled feature maps. A pooling layer is added after the convolutional
layer, once a non-linearity has been applied to the feature maps output by that
convolution layer.

Pooling applies a pooling operation, much like a filter, to the feature maps. The size
of this filter is smaller than the size of the feature map. A typical pooling layer
reduces each dimension of every feature map by a factor of two, reducing the number of
pixels or values in each feature map to one quarter of its original size.

The common pooling functions are average pooling, where the average value of each patch
on the feature map is calculated, and max pooling, where the maximum value of each patch
of the feature map is taken. Max pooling simply takes the maximum of a region and helps
to carry forward the most important features of the image. The result of using a pooling
layer to create down-sampled (pooled) feature maps is a summarized version of the
features detected in the input. They are useful because small changes in the location of
a feature in the input, as detected by the convolutional layer, still produce a pooled
feature map with the feature in the same location. This capability added by pooling is
called the model's invariance to local translation.

Figure 5-5: Pooling Layer

5.3.1.3 Flattening:

The output from the previous layers is flattened into a single vector so that it can be
fed to the next layer.

Figure 5-6: Flattening Operation

5.3.1.4 Fully Connected Layer:

Fully connected layers are layers in which every input from one layer is connected to
every activation unit of the next layer. In many machine learning models, the last few
layers are fully connected layers, which compile the features extracted by the previous
layers to form the final output. It is the second most time-consuming layer after the
convolutional layer.

After feature extraction we want to classify the data into various categories, which is
done with a fully connected network. Adding fully connected layers makes the model
end-to-end trainable. The fully connected layers learn a function of the high-level
features output by the convolutional layers.

Figure 5-7: Fully Connected Layer

5.3.1.5 Dropout

Once the features are passed to the fully connected layer, the model can overfit the
dataset. Overfitting happens when a model performs very well on the training data but
its performance degrades on new data.

To overcome this problem, a dropout layer is used, in which a fraction of the neurons is
dropped from the neural network during training, effectively reducing the size of the
model. With a dropout rate of 0.2, 20 percent of the nodes are dropped at random from
the network. With dropout, our model was able to give good accuracy with less
overfitting.

5.3.1.6 Activation Function

One of the most important parameters of a CNN model is the activation function.
Activation functions are used to learn and approximate any kind of continuous and
complex relationship between the variables of the network. In simple words, they decide
which information should be passed forward through the network and which should not.

The activation function adds non-linearity to the network. Commonly used activation
functions include ReLU, softmax, tanh and sigmoid, each with its specific usage. For a
binary classification CNN model, sigmoid and softmax are preferred, while for
multi-class classification softmax is typically used. We used ReLU and softmax in our
project.

Figure 5-8: Softmax Function

Figure 5-9: Relu Function

5.3.2 AlexNet Algorithm

AlexNet is a CNN with 8 layers. The pretrained network is capable of classifying images
into 1000 object categories. Its input is an image of size 227x227x3. This model was
used at the start of our project because it is lightweight and the required
documentation was easily available on the internet. Later we used VGG-16 and VGG-19 for
further evaluation and to increase the accuracy of the model.

Figure 5-10: AlexNet Architecture

5.3.3 VGG-16 Algorithm

Figure 5-11: VGG 16 Algorithm

VGG16 is a CNN model proposed by K. Simonyan and A. Zisserman. 'VGG' stands for Visual
Geometry Group, the group of researchers at the University of Oxford who developed this
architecture, and '16' means that the architecture has sixteen layers. It improves on
AlexNet by replacing the large kernel-sized filters with multiple 3×3 kernel-sized
filters stacked one after another.

5.3.4 VGG-19 Algorithm

The main idea of the VGG19 model is the same as VGG16, except that it has 19 layers;
that is, VGG19 has three more convolutional layers.

A fixed-size (224 × 224) RGB image is given as input to this network. The only
preprocessing is subtracting the mean RGB value, computed over the whole training set,
from every pixel. Kernels of size (3 × 3) with a stride of one pixel are applied,
covering the whole image, and spatial padding is used to preserve the spatial resolution
of the image. Max pooling is performed over 2×2-pixel windows with stride 2. This is
followed by ReLU to introduce non-linearity into the model, which improves
classification and computation time compared with the tanh or sigmoid used in previous
models.
VGG-19 uses three fully connected layers, the first two of size 4096 and a final layer
with 1000 channels for 1000-way ILSVRC classification. The final layer is a softmax
function.
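
For illustration, one hedged way to build such a network in Keras is sketched below.
Whether the project instantiated VGG-19 through keras.applications is not stated, and
the 15-class output head is an assumption in place of the 1000-way ILSVRC head described
above.

import tensorflow as tf

# VGG-19 convolutional base without the ILSVRC classification head
base = tf.keras.applications.VGG19(weights=None, include_top=False,
                                   input_shape=(224, 224, 3))
x = tf.keras.layers.Flatten()(base.output)
x = tf.keras.layers.Dense(4096, activation="relu")(x)            # first fully connected layer
x = tf.keras.layers.Dense(4096, activation="relu")(x)            # second fully connected layer
outputs = tf.keras.layers.Dense(15, activation="softmax")(x)     # assumed number of grocery classes
model = tf.keras.Model(base.input, outputs)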

Figure 5-12: VGG-19 Flowchart

The number of filters increases as we go deeper, so more features are extracted as we
move deeper into the architecture. At the same time the spatial size of the feature maps
decreases, because the pooling layers repeatedly reduce the feature map shape.

5.4 Recommendation System

The recommendation system used in this project is based on textual clustering with the
K-means algorithm. Text clustering is the task of grouping a set of unlabeled texts in
such a way that texts in the same cluster are more similar to each other than to texts
in other clusters. Text clustering algorithms process the text and determine whether
natural clusters exist in the data. In this project the K-means clustering algorithm is
used.

5.4.1 K means clustering Algorithm

Clustering is an unsupervised algorithm. It finds structure in a collection of unlabeled
data by gathering items into sets whose elements are similar in some way. A cluster is
therefore a group of objects that are internally coherent but dissimilar to the objects
belonging to other clusters.
The algorithm used in this project is described below (a small sketch follows the steps):
1. Select the number of clusters 'k' to be made.
2. Randomly select k objects as the initial cluster centers.
3. Repeat:
3.1. Assign each incoming object to its closest cluster.
3.2. Recalculate each cluster center as the mean of the points in the cluster.
4. Until the cluster centers stop changing position, or no object changes its cluster,
which is the stopping criterion.
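
The steps above can be illustrated with a small NumPy sketch; the random initialization,
Euclidean distance and data layout are assumptions made for illustration only.

import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]   # step 2: random initial centers
    for _ in range(max_iter):                                          # step 3: repeat
        # step 3.1: assign every point to its closest cluster center
        labels = np.argmin(np.linalg.norm(points[:, None] - centers[None], axis=2), axis=1)
        # step 3.2: recompute each center as the mean of the points assigned to it
        new_centers = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):                          # step 4: stop when centers settle
            break
        centers = new_centers
    return labels, centers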

Figure 5-13: K-means algorithm flowchart

6. IMPLEMENTATION DETAILS

The project implementation follows these steps:

❖ Web scraping of e-commerce sites
❖ Scanning items with a barcode
❖ Recognizing images of the given dataset using CNN
❖ Website development to interface with the shopping list

Web scraping is done for three e-commerce sites, namely Alibaba, Sastodeal and Amazon
India, using Python libraries. The information obtained includes the description, price,
image URL and volume of the first three items of a search result. The Barcode Monster
API is used to interface items with their barcodes. Website development for the user
interface is done with the help of Django. The website includes the shopping list and
instant ordering for added items.

6.1 TensorFlow Layers

Some of the TensorFlow layers used in the development of the CNN model are listed below;
a minimal example combining them follows the list:

➢ tf.keras.layers.Conv2D
This layer creates a convolution kernel that is convolved with the layer input to
produce a tensor of outputs.

➢ tf.keras.layers.MaxPooling2D
This layer down-samples the input along its spatial dimensions by taking the
maximum value of an input window. The window is shifted by the stride along each
dimension and max pooling is performed at every position.

➢ tf.keras.layers.Dropout
Overfitting is reduced by the dropout layer. This layer randomly sets input
units to 0 with a given rate at each step during training.

➢ tf.keras.layers.Flatten

It flattens the output obtained from the max pooling layer, converting the
multi-dimensional feature maps into a single vector.

➢ tf.keras.layers.Dense
The dense layer is used in the final stage of the neural network. It takes the input,
applies a weight matrix and a bias, and passes the result through an activation function
(such as ReLU or softmax), which helps to classify the image.
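
A minimal sketch combining these layers into one model is shown below. The filter
counts, the 224×224×3 input size and the 15 output classes are illustrative assumptions,
not the exact configuration used in the project.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),                        # drops 20% of the units during training
    tf.keras.layers.Dense(15, activation="softmax"),     # one output unit per grocery class
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])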

6.2 Training Procedure

The training procedure of our project is explained below:

1. Data collection

We collected data from various sources according to our problem statement.

2. Data preprocessing

The collected data were preprocessed to make them suitable inputs for machine learning.
We built the image dataset by resizing all collected images to the same scale and
balancing the number of images across classes, making it easier for the machine learning
model to process the data.

The dataset is divided into a training set and a testing set. The training set is used
to train the model, while the testing set is used to evaluate the model's accuracy.

3. Model selection

Since we are trying to identify grocery items, we used a CNN classification model.

4. Model training

The selected model is trained on the training dataset.

5. Evaluation

After training, the test dataset is fed to the model and the model is evaluated. If the
evaluation is unsatisfactory, we change the parameters and train the model again. We
used K-fold cross-validation, the confusion matrix, precision-recall graphs and ROC
curves to evaluate our model.

6. Performance tuning

Performance tuning is done to obtain high accuracy in the output. We tune the
performance by various methods: increasing the number of epochs, the batch size or the
image resolution, choosing the learning rate with the help of plots of loss against
learning rate, finding and deleting corrupt data in the dataset, and changing the
hyperparameters.

7. Prediction

After performance tuning, we predict the outcome on real-time input.

6.2.1 Libraries imported for image recognition

➢ NumPy
NumPy is a library for handling multidimensional arrays.
➢ Matplotlib
Matplotlib is a library for visualizing plots and graphs when analyzing data.

6.2.2 Learning Rate

The learning rate controls how much the model changes in response to the error each time
the weights and biases are updated. The learning rate should be tuned properly; a value
that is too high or too low may result in improper training.

6.2.3 Confusion matrix

A confusion matrix contains the correctly classified and misclassified counts for each
class. The model's confusion between classes is measured with the help of this matrix;
high values on the diagonal indicate good model performance. When making predictions,
the model may be confused and classify items into the wrong class. The overall
performance can be monitored with the help of this matrix.

6.2.4 Precision recall graph

A PR curve is a graph with precision values on the y-axis and recall values on the
x-axis. Precision is also called the Positive Predictive Value (PPV); recall is also
called sensitivity or the True Positive Rate (TPR).
Precision = TruePositives / (TruePositives + FalsePositives)
Recall = TruePositives / (TruePositives + FalseNegatives)
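
A small worked example of the two formulas, using scikit-learn on toy labels rather than
project data:

from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]   # actual classes
y_pred = [1, 0, 0, 1, 1, 1]   # predicted classes: TP = 3, FP = 1, FN = 1
print(precision_score(y_true, y_pred))   # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))      # 3 / (3 + 1) = 0.75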

6.2.5 Regularization

Regularization is the process of modifying a learning algorithm to favor simpler
prediction rules in order to avoid overfitting. It refers to modifying the loss function
to penalize certain values of the weights being learned. Regularization adjusts the
layer parameters during optimization by applying a penalty; the penalty terms are added
to the loss function.

6.2.6 Optimizers

Optimizers are algorithms that change the attributes of a neural network, such as the
weights and the learning rate, in order to reduce the loss. Optimization algorithms are
responsible for producing the most accurate results possible. There are many optimizers,
such as ADAM, gradient descent and momentum; in this project the ADAM optimizer is used.
Adaptive Moment Estimation (ADAM) is an optimization algorithm for gradient descent. It
makes large problems with a lot of data easier to handle, requires little memory and is
efficient.

6.2.7 Hyper-parameter Tuning

Hyper-parameter tuning is the selection of the parameters of the algorithm that are set
prior to the learning process. The parameters tuned during this process are the learning
rate, the number of layers and the regularization constant.

6.3 Website Development

Website development is performed using Django, a free and open-source web framework
based on Python. The progressive tasks performed are listed below:

6.3.1 Creating virtual Environment

A virtual environment manages the dependencies of Python projects. It works as a
self-contained, isolated environment in which all the Python packages and the specific
versions required by a project are installed. The virtual environment is created with
'python -m venv env', where env is the virtual environment directory shown by the 'ls'
command.

6.3.2 Django project creation

The project is created using the 'django-admin startproject project_name' command. The
project thus created can be viewed in the browser.

6.3.3 Creating an App

The project is further developed by creating an app with the 'python manage.py
startapp app_name' command. In Django, a single project can contain many apps, where
each app serves a single, specific piece of functionality for the project.

6.3.4 Changing in Model

SQLite is used as the default database in Django. Django uses an Object Relational
Mapper (ORM), which makes it easy to work with the database. Inside 'blog/models.py' we
need to create a new model: a class that inherits from 'models.Model' and will later
become a database table.
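
A hedged sketch of the kind of model described here is shown below; the class and field
names are hypothetical, chosen to match the shopping-list information stored later
(company, description, image URL, date and time).

from django.db import models
from django.contrib.auth.models import User

class ShoppingItem(models.Model):
    # Each item belongs to the logged-in user who added it
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=200)
    company = models.CharField(max_length=200, blank=True)
    description = models.TextField(blank=True)
    image_url = models.URLField(blank=True)
    added_at = models.DateTimeField(auto_now_add=True)   # date and time the product was added

    def __str__(self):
        return self.name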

6.4 Web scraping

In order to access data from e-commerce sites we used web scraping, implemented in
Python. Three libraries are used for this purpose: Beautiful Soup, pandas and requests.
These libraries support web scraping in Python: we first make an HTTP call and then
extract the elements using Beautiful Soup.

The Python library Beautiful Soup extracts data from HTML and XML files. It relies on a
parser; the parser used here is lxml. It provides idiomatic ways of navigating,
searching and modifying the parse tree.

Scraping is done for three e-commerce websites: Alibaba, Amazon India and Sastodeal.
From each of the three websites, the first three results for the searched product are
taken into consideration. The URL of the product, its title, its price and its image URL
are extracted. This information is used for shopping list generation. The user can also
shop directly from within our website, which redirects them to the original e-commerce
shopping site.
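
A minimal sketch of this scraping step is given below. The CSS selectors are
placeholders, since each of the three sites uses its own markup, and the request headers
and timeout are also assumptions.

import requests
from bs4 import BeautifulSoup

def scrape_first_results(search_url, limit=3):
    response = requests.get(search_url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    soup = BeautifulSoup(response.text, "lxml")
    results = []
    # "div.product-card", ".title", ".price" are placeholder selectors for a real site's markup
    for card in soup.select("div.product-card")[:limit]:
        results.append({
            "title": card.select_one(".title").get_text(strip=True),
            "price": card.select_one(".price").get_text(strip=True),
            "image_url": card.select_one("img")["src"],
            "url": card.select_one("a")["href"],
        })
    return results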

6.5 Barcode analysis

Items that carry a barcode can be read from the barcode itself. Barcodes are handled
with the help of Barcode Monster. Barcode analysis involves the following steps:

6.5.1 Barcode

Barcode Monster supports two types of barcodes, EAN-8 and EAN-13. EAN is the standard
barcode used on most available products and is compatible with UPC and JAN. A UPC
barcode consists of a machine-readable barcode with a twelve-digit number underneath. In
this project, the EAN-13 format is used.

6.5.2 Barcode API

An API defines a set of rules that enables programs to communicate with each other while
exposing data and functionality. Barcode Monster, used in this project, provides REST
APIs for looking up barcode information. The API is accessed via the HTTP protocol.

6.5.3 Barcode Generator

A Python library is used to read barcodes in the EAN format. This involves two steps:
scanning the barcode and decoding it. OpenCV (imported as 'cv2') is used to access the
computer's camera, and the barcode is scanned with the laptop's built-in camera. The
'decode' function from 'pyzbar.pyzbar' then decodes the scanned barcode. In this way we
obtain the numeric digits, which can then be looked up through the API.
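
A rough sketch of this scan-and-decode loop is shown below; the camera index 0 and the
'q' quit key are assumptions.

import cv2 as cv
from pyzbar.pyzbar import decode

def scan_barcode():
    cap = cv.VideoCapture(0)               # laptop's built-in camera
    code = None
    while code is None:
        ok, frame = cap.read()
        if not ok:
            break
        for symbol in decode(frame):       # pyzbar finds and decodes EAN barcodes in the frame
            code = symbol.data.decode("utf-8")
        cv.imshow("Barcode scanner", frame)
        if cv.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv.destroyAllWindows()
    return code                            # e.g. "1234567890123", to be looked up through the API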

6.5.4 Barcode information

Information is provided both in HTML and in JSON format. The HTML format works by
appending the code after the /code URL path. To use the API and get JSON-formatted
content, we can call /api/1234567890123, where 1234567890123 is the code we are looking
for. The barcode lookup returns the company, description and image URL of the item.
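
For illustration, the JSON lookup can be done with a simple HTTP GET. Only the
/api/<code> path comes from the text above; the base URL shown here is an assumption
about the Barcode Monster service.

import requests

def lookup_barcode(code, base_url="https://barcode.monster"):
    # GET /api/<code> is expected to return JSON with company, description and image URL fields
    response = requests.get(f"{base_url}/api/{code}", timeout=10)
    response.raise_for_status()
    return response.json()

# Example: lookup_barcode("1234567890123")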

6.6 Textual clustering-based recommendation system (K-means algorithm)

The project has no record of past user behavior, such as ratings given to the items in
the shopping list. Textual clustering-based recommendation therefore finds its
application in this project: the descriptions of the items in the shopping list are used
to build clusters and recommend items accordingly.

6.6.1 Dataset

A CSV file is exported from the Django database to be fed to the recommendation model as
the dataset. The information is exported dynamically and contains the user and the
product description in separate columns. The words in the product description column are
used for the textual clustering.

6.6.2 Libraries imported for Recommendation

Different libraries are imported for the recommendation model. They are listed below:
➢ Sklearn
Scikit-learn (sklearn) is a popular Python library providing diverse algorithms for
classification; it also performs clustering and dimensionality reduction.
➢ TfidfVectorizer
TfidfVectorizer converts a collection of raw documents into a matrix of TF-IDF features.
TF-IDF (term frequency-inverse document frequency) is a statistical measure that
evaluates how relevant a word is to a document in a collection of documents.
➢ CountVectorizer
CountVectorizer is a useful class provided by the scikit-learn library in Python. It
transforms a given text into a vector based on the frequency (count) of each word
occurring in the entire text. This is useful when we have multiple such texts and want
to convert every word of every text into a vector.

Term frequency-inverse document frequency is a text vectorizer that transforms text into
a usable vector. It combines two ideas, term frequency (TF) and document frequency (DF).
The term frequency is the number of occurrences of a particular term in a document and
indicates how important that term is in the document. Term frequency represents each
text in the data as a matrix whose rows are the documents and whose columns are the
distinct terms across all documents. Inverse document frequency (IDF) is the weight of a
term; it aims to reduce the weight of a term whose occurrences are scattered across all
the documents.

6.6.3 Text conversion

The texts are converted into numerical values for feature extraction and further
analysis. The resulting output after text conversion is a sparse matrix, which stores
positional values rather than the full matrix and therefore consumes less memory. Stop
words are also removed during this process; stop words are the words that are generally
filtered out before processing natural language, and they are the most common words in
any language.

6.6.4 Elbow method for optimum value of cluster

The elbow method is used for determining the cluster size. The sum of squared distances
from the center of each cluster to each of its points is calculated for a range of
cluster sizes. The resulting plot looks like an elbow, and the bend of the elbow gives
the appropriate cluster size. The resources required to study and implement the elbow
method were readily available, and after our research it was the best fit for the
project.

6.6.5 Silhouette Plot for validation of cluster size

The cluster size obtained from the elbow technique is used to generate the silhouette
plot. Silhouette analysis is used to study the separation distance between the resulting
clusters. The silhouette plot shows how close each point in one cluster is to points in
the neighboring clusters, and thus provides a way to assess parameters such as the
number of clusters visually. The silhouette coefficient has a range of [-1, 1].
Coefficients near +1 indicate that the sample is far away from the neighboring clusters,
a value of zero indicates that the sample is on or very close to the decision boundary
between two neighboring clusters, and negative values indicate that those samples may
have been assigned to the wrong cluster.

6.6.6 Fitting K-means to the dataset

After obtaining the optimum value of K from the elbow method, K-means is fitted to the
dataset. It computes the cluster centers and predicts the cluster index for each sample.

6.6.7 Predicting the cluster based on keywords given

Finally, keywords are given to the model, which then predicts the corresponding cluster.
The texts of the predicted cluster are then recommended to the user (a hedged
end-to-end sketch is given below).
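
The sketch below strings together the steps of this section: CSV export, TF-IDF
vectorization with stop words, the elbow loop, fitting K-means and keyword prediction.
The file name, the "description" column name and the choice of k = 25 from the elbow
plot are assumptions based on the surrounding text, not the project's exact code.

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

df = pd.read_csv("shopping_list.csv")                    # exported from the Django database
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(df["description"])          # sparse TF-IDF matrix

# Elbow method: compute WCSS (inertia) for a range of k and pick the bend in the plot
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in range(2, 40)]

kmeans = KMeans(n_clusters=25, n_init=10, random_state=0).fit(X)   # k = 25 from the elbow plot
df["cluster"] = kmeans.labels_

def recommend(keywords, top_n=10):
    # Predict the cluster of the given keywords and return item descriptions from that cluster
    cluster = kmeans.predict(vectorizer.transform([keywords]))[0]
    return df.loc[df["cluster"] == cluster, "description"].head(top_n)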

6.7 Speech Recognition

Vosk is a free, open-source Python toolkit for offline speech recognition. It supports
speech recognition in many languages and works offline on devices such as the Raspberry
Pi, Android and iOS. It is installed with "pip3 install vosk".

Vosk provides portable per-language models (about 50 MB each), but much bigger server
models are also available for languages including English, Indian English, French,
Spanish, Portuguese, Chinese, Russian, Turkish, German and Catalan. The user experience
is enhanced by its streaming API, and speaker identification is also supported. The
libraries used alongside Vosk are as follows (a minimal recognition sketch follows this
list):
• Librosa
It is a library for retrieving information from music and audio.
• Scipy
It is used to manipulate and visualize the data. It is built on NumPy.
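
A minimal offline-recognition sketch with Vosk is shown below. The local model folder,
the use of the sounddevice package for microphone capture and the handling of the "okay"
keyword are assumptions for illustration.

import json
import queue
import sounddevice as sd
from vosk import Model, KaldiRecognizer

audio_queue = queue.Queue()
model = Model("model")                      # folder containing a downloaded Vosk language model
recognizer = KaldiRecognizer(model, 16000)

def callback(indata, frames, time, status):
    audio_queue.put(bytes(indata))          # stream raw microphone audio into the queue

with sd.RawInputStream(samplerate=16000, blocksize=8000, dtype="int16",
                       channels=1, callback=callback):
    print("Say 'okay <item>' to add an item to the shopping list")
    while True:
        if recognizer.AcceptWaveform(audio_queue.get()):
            text = json.loads(recognizer.Result()).get("text", "")
            if text.startswith("okay"):
                item = text.replace("okay", "", 1).strip()
                print("Adding to shopping list:", item)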

7. RESULT AND ANALYSIS

This chapter presents and analyzes the results obtained in the project, starting with
barcode scanning.

7.1 Barcode Result

Items that have a barcode can easily be entered in the shopping list, since we can
interface the product through barcode scanning. The Python libraries play the vital role
of scanning and decoding the barcode information. The items are scanned using the
laptop's camera.

Figure 7-1: Barcode Result

The name of the item, the company name, its description and the image URL are shown
after scanning the item, as in the figure above. This information is used to generate
the shopping list with the product's image and price.

7.2 Web scraping result

After web scraping using the Python libraries, we are able to access three e-commerce
websites: Sastodeal, Alibaba and Amazon. The following information is obtained from web
scraping:

➢ URL of the product

➢ Title of the product
➢ Price of the product
➢ Image URL

This information is collected from the first three products returned by each e-commerce
website for the searched item. The result thus obtained is shown in the figure below.

Figure 7-2: Web scraping result

7.3 User Interface

The user is presented with a screen where they can select among the three methods of
shopping list generation. Pressing 1 on the keyboard opens the barcode scanning
interface; similarly, pressing 2 or 3 lets them use the image recognition or speech
recognition method, respectively.

Figure 7-3: User Interface

7.4 Website Result

The website for this project is built with the help of Django. It provides the shopping
list and instant ordering, and can be divided into two sections.

7.4.1 Shopping list

The items scanned with a barcode are added to the shopping list along with the following
information.

➢ Added Date
➢ Added time
➢ Name of item
➢ Item description (volume)
➢ Image obtained from URL

Figure 7-4: Shopping list

7.4.2 Instant Order

The items, along with their descriptions, are presented in the instant order section,
from which users can place an order on the respective website. This section includes
each item's price, image and description with an 'order now' option. Clicking 'order
now' redirects the user to the original website, where the order is placed.

Figure 7-5: Instant Order

7.4.3 Website database

A user database entry is created on the website as soon as the user logs in. It
includes:

➢ Username
➢ Company name
➢ Description of the product
➢ Image URL
➢ Date and time of the product added

Figure 7-6: Website database

This database differs from user to user: a product that is added is stored against the
user ID used during the login process.

7.5 CNN Results with dataset

After generating the dataset of interest, we fed the data to the CNN model for training.
Normalization and shuffling of the dataset are performed before passing it to the model.
The data are split 70%/30% into training and validation sets, respectively. The results
were obtained using a batch size of 16 and 60 epochs. Various results are used to tune
the model parameters. Results are reported for several models, namely VGG16, VGG19 and
AlexNet.

7.5.1 Accuracy of model with 2975 images

The dataset was first checked for accuracy without augmentation. The result obtained is
shown in the figure below.

Figure 7-7: Graph of non-augmented data

As we can see from the graph, the spikes show that the model is overfitted. We therefore
augmented the available data, expecting to obtain a model with better generalization
capability.

7.5.2 5258 images (2975 original +2283 augmented images) plot

Image data augmentation artificially expands the size of a training dataset by creating
modified versions of its images. It is used here to expand the training dataset in order
to improve the performance and generalization ability of the model. Training deep
learning models on more data can result in more skillful models, and augmentation
creates variations of the images that improve the ability of the fitted models to
generalize what they have learned to new images. The result obtained is shown below.

Figure 7-8: Graph of augmented data

The amount of data was still not enough to build a model with acceptable accuracy, so we
focused on collecting a larger image dataset. Based on further study, we reduced the
number of classes and increased the total number of images in the dataset to 23,927.
Finding the optimal learning rate could further improve the model.

7.5.3 Calculation of Learning rate

The learning rate is a hyperparameter set when configuring the neural network, so it is
vital to investigate its effect on model performance and to build an intuition about its
influence on model behavior. A curve is plotted with the learning rate on the horizontal
axis and the loss on the vertical axis. The learning rate was tuned for the VGG-19
model, and the value that fitted the model best is 0.0001. Without a good learning rate,
the accuracy of the model was found to be much lower.

Figure 7-9: Learning rate for VGG19

We trained three different models: AlexNet, VGG-16 and VGG-19. We found that the best
architecture for our application was VGG-19, as suggested by the plots below.

7.5.4 Training-validation accuracy and loss for AlexNet

These accuracy and loss curves are plotted with 60 epochs, a batch size of 16 and a
learning rate of 0.0001.

Figure 7-10: Training and validation accuracy and loss curve for AlexNet model

From the loss-versus-epoch graph of the AlexNet model, it can be concluded that the loss
decreased drastically as the number of epochs increased. The validation and training
loss curves are similar to each other, which suggests a good fit. The accuracy increased
sharply in the first half of the total training period, i.e. learning was fast; after
that, learning slowed and the validation curve roughly followed the training curve.

7.5.5 Training-validation accuracy and loss for VGG-16

The graphs for the VGG16 model suggest that accuracy rose quickly during the first
quarter of the total training period. The loss curve suggests that the model is somewhat
overfitted.

Figure 7-11: Accuracy and loss curve for VGG16 model

7.5.6 Accuracy and loss for VGG-19

The VGG19 model fits better than VGG16, and its learning is somewhat faster. In all of
the above graphs, model accuracy increases slowly after a certain number of epochs.

Figure 7-12: Accuracy and loss curve for VGG19 model

7.5.7 Confusion matrix

The confusion matrices for the different models are listed below. We can analyze each
model using these matrix elements.

7.5.7.1 Confusion matrix for AlexNet

AlexNet performed best on class labels 5 and 14, the carrot and orange classes
respectively, and worst on class label 13, the onion class. This model classified 4389
images correctly. Apple, capsicum and orange were classified most correctly by this
model.

Figure 7-13: Confusion matrix from AlexNet model

7.5.7.2 Confusion matrix for VGG-16

With the VGG16 model, garlic was misclassified most often, while apple, banana, capsicum
and orange were classified most correctly. It performed best on class labels 5 and 14,
the carrot and orange classes respectively, and worst on class label 13, the onion
class. This model classified 4473 images correctly.

Figure 7-14: Confusion matrix from VGG16 model

7.5.7.3 Confusion matrix for VGG-19

With the VGG19 model, chili was misclassified most often, while apple, banana, capsicum,
orange and similar classes were classified most correctly. It performed best on class
label 5, the carrot class, and worst on class label 13, the onion class. This model
classified 4425 images correctly.

Figure 7-15: Confusion matrix from VGG19 model

7.5.8 Precision Recall Curve (PR curve)

Precision-recall is a measure of the success of prediction when the classes are very
imbalanced. Precision is a measure of result relevancy, while recall is a measure of how
many truly relevant results are returned.

The PR curve shows the trade-off between precision and recall. A high area under the
curve represents both high recall and high precision: high precision relates to a low
false positive rate and high recall relates to a low false negative rate. High scores
for both show that the classifier returns accurate results as well as the majority of
all positive results.

Figure 7-16: PR Curve for AlexNet model

Figure 7-17: PR Curve for VGG16 model

56
Figure 7-18: PR curve for VGG19 model

The curves obtained from the different models are quite alike and indicate excellent
performance. Each curve runs horizontally from the top left and then drops vertically,
giving an area close to 1. This suggests that the model and dataset fit well when
analyzed with the precision-recall curve.

7.6 Recommendation system results

Different results are obtained from the textual clustering with K-means. Based on these
outputs the model can be analyzed.

7.6.1 Input Dataset

The CSV dataset for the recommendation is shown below. The dataset includes two columns:
one contains the website user and the other the description of the item added to the
shopping list.

Figure 7-19: Sample of CSV dataset generated from Django database

7.6.2 Data conversion

The text data are converted into numerical values for feature extraction and analysis.
The generated sparse matrix is shown below; its size is 5000×167 and it stores the
positional values of the vectorized text data.

Figure 7-20: Sparse matrix of text data

7.6.3 Elbow diagram

The elbow diagram plots WCSS against cluster size. The cluster size is selected at the
elbow point, where the decrease in WCSS levels off. From the plotted curve, we conclude
that the value of k (the cluster size) is 25.

Figure 7-21: Elbow Diagram for K-means algorithm

7.6.4 Visualization of cluster in linear diagram

The data fitted with K-means are plotted in a graph with "." as the marker. The result
obtained is shown below, displaying the different clusters with their respective cluster
numbers.

Figure 7-22: Cluster visualization

7.6.5 Silhouette Plot for cluster size obtained from Elbow method

On plotting the silhouette graph, the average value obtained is 0.75, which is close to
1. From this we conclude that the cluster size obtained from the elbow method fits the
given data, i.e. the score obtained in this project is a good and acceptable one.

Figure 7-23: Silhouette Score plot for clustering

7.6.6 Top keywords of cluster

The top keywords of each cluster can also be visualized to analyze how well the clusters
are formed. The keywords per cluster are shown below; for simplicity only 10 keywords
per cluster are printed.

Figure 7-24: Top keywords of cluster

7.6.7 Recommendation of cluster

Based on the keywords given to the model, the corresponding cluster is recommended. In
the figure below, the keyword is 'tropical food'. With this text as the keyword, the
system predicts cluster 4, whose top keywords are frankfurter, fruit, tropical, citrus,
meat, pork, beef, soda, berries and canned. This is how texts are recommended to the
website user according to their shopping list entries.

Figure 7-25: Recommendation sample

The recommendation of items from the three e-commerce sites with the 'order now' option
is shown below:

Figure 7-26: Recommended items

7.7 Speech Recognition

7.7.1 Output from Vosk

The Vosk API is triggered by the keyword 'okay'. The user is then given the opportunity
to add the item to the shopping list.

Figure 7-27: Output from Vosk

7.7.2 Generate CSV file

The grocery item that the user wants to add to the list is successfully identified using
speech recognition, and the item is then written to a CSV file.

Figure 7-28: CSV file

7.7.3 Shopping list generation

After creating the CSV file, the data are transferred to the Django database. The item
is then listed in the shopping list by extracting it from the database.

Figure 7-29: Grocery item added to the shopping list

8. FUTURE ENHANCEMENTS

No project can be perfect enough to cover every aspect of its field, and this project
can also be improved in various areas. One of them is the dataset: the dataset used
covers grocery items only and could be extended to other fields such as clothing,
electronics and many more. The website interface can be made more user friendly. The
shopping list could be updated more intelligently, for example by removing an item
automatically when the user orders it instantly, or by flushing the shopping list
periodically, say weekly or monthly. Product scanning could be integrated directly into
the website, making it more effective. Furthermore, an app could be developed for the
project to reduce the time and effort of the users. More CNN models could be explored
and the number of dataset classes increased. Recommendation of items could also be
upgraded with other filtering processes; the recommendation system integrated in this
project is still somewhat immature. A better result could be obtained by applying an
ontology-driven recommendation system, which would greatly reduce the cases of unrelated
recommendations.

9. CONCLUSION

The project was completed by combining barcode scanning, image recognition, speech
processing and a recommendation system to generate a shopping list on the website the
user logs in to. The project was executed in a number of steps. The first step was to
make an entry for an object through any of three methods: the first method used image
processing to recognize the object, the second used speech to make an entry about the
object, and the third used the barcode printed on the product to extract details about
the item. After identifying the object, an entry was made in the system for the further
steps. The second step was the generation of the list and of the alternative websites
from which the user could buy the items; this list contains details about the websites
where the items are available along with their prices. Upon selecting the best choice
among the listed alternatives, the user is redirected to the original page. In the third
step, we used the recommendation system to recommend further products to the user, who
can select any item from the recommendations or ignore the prompt.
Most grocery items can be identified by the project, with some exceptions due to the
barcode API and the small dataset. Speech recognition is affected by background noise to
some extent. Due to the limitations most e-commerce websites impose on web scraping, we
were only able to scrape three websites when searching for products.

10. APPENDICES

Appendix A: Project Schedule

[Gantt chart: the project schedule shows the start date and duration of each task —
brainstorming, documentation and literature reviewing, learning about the simulator,
learning webpage designing, learning about image processing and AI, learning about
speech processing, and testing and debugging.]

Table 10-1: Gantt chart of Project Schedule

Appendix B: Value of original data

Figure 10-1: Value of original data

Appendix C: Value of augmented data

Figure 10-2: Value of augmented data

Appendix D: Training result of VGG19 model

Figure 10-3: Training result of VGG19 model

