A Survey On Multiclass Image Classification Based On Inception-V3 Transfer Learning Model
https://doi.org/10.22214/ijraset.2021.33018
Abstract: Transfer learning is the reuse of a pre-trained model on a new problem. It is very popular in deep learning because it allows deep neural networks to be trained with relatively little data, and it is very useful in data science because, for most real problems, millions of labeled data points are not available to train such complex models. This paper looks at what transfer learning is, how it works, and why and when to use it, and includes several resources for pre-trained models used in transfer learning. For example, if you train a simple classifier to predict whether an image contains a backpack, you can reuse the knowledge the model gained during training to help recognize other objects, such as foods or drinks.
Index Terms: Food image, Transfer learning, Inception-v3.
I. INTRODUCTION
Several approaches have been proposed to classify food from images. In previous years, many feature-based models were used to classify food images; SCD, EFD, GFD, and LBP are among the common features that have been applied to this task. In the modern literature, neural networks, especially convolutional neural networks, have been used to classify food images.
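As a concrete illustration of the feature-based approach, the following Python sketch computes an LBP histogram descriptor with scikit-image. It is a minimal example under assumed default parameters, not the exact pipeline of any surveyed work.

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_image, n_points=8, radius=1):
    # Compute uniform LBP codes; the "uniform" method yields
    # n_points + 2 distinct code values.
    lbp = local_binary_pattern(gray_image, n_points, radius, method="uniform")
    n_bins = n_points + 2
    hist, _ = np.histogram(lbp.ravel(), bins=n_bins, range=(0, n_bins))
    # Normalize so descriptors are comparable across image sizes.
    return hist / hist.sum()

# Usage (assuming `gray` is a 2-D grayscale array):
# descriptor = lbp_histogram(gray, n_points=8, radius=1)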
One approach classifies food images using spherical support vector machines. Support vector machines perform nonlinear classification efficiently through the kernel trick. In this approach, the method was applied to a food-log data set consisting of 6,512 images, which were first segmented using the fuzzy c-means (FCM) algorithm, a clustering method similar to k-means that can be applied to food images: each data point is assigned a membership coefficient for every cluster, and the centroid of each cluster is computed from these coefficients, iterating until convergence. After applying FCM to segment the food images, a spherical SVM classifier achieved an accuracy of 95% on the Food-101 data set.
Random forests are an ensemble method for classification, regression, and other tasks that builds a collection of decision trees during training and outputs the class that is the mode of the individual trees' predictions (classification) or their mean prediction (regression). Using the RFDC method, this approach achieved an accuracy of 50.76%.
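To make the two classical classifiers concrete, the sketch below trains a kernel SVM and a random forest with scikit-learn. The synthetic feature matrix stands in for a real descriptor such as the LBP histogram above, and the hyperparameters are illustrative assumptions, not values from the surveyed papers.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for real image feature vectors and labels.
X, y = make_classification(n_samples=500, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

# Nonlinear SVM: the RBF kernel implicitly maps features into a
# higher-dimensional space (the "kernel trick").
svm = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_train, y_train)

# Random forest: an ensemble of decision trees whose majority vote
# gives the predicted class.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print("SVM accuracy:", svm.score(X_test, y_test))
print("Random forest accuracy:", forest.score(X_test, y_test))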
Transfer learning is a tool for improving model performance on a target domain when labeled data in the target domain is scarce; otherwise, transferring knowledge is of little value. So far, most studies of transfer learning have focused only on small-scale data, which cannot fully reflect its potential for regular machine learning techniques. Future challenges for transfer learning lie in two aspects: 1) how to extract information useful for the target domain from high-noise source data domains, and 2) how to extend current transfer learning methods to handle large-scale source data domains.
C. Inception Overview
In this paper we consider Inception, which was developed as the GoogLeNet architecture seen in ILSVRC 2014. It is also inspired by the primate-visual-cortex model of Serre et al., which can capture visual information at many scales. One of the key features of the Inception architecture is its adaptation of the "network in network" method of Lin et al., which increases the representational power of neural networks; in Inception, this takes the form of 1 × 1 convolutions used for dimension reduction. The purpose of the architecture is to reduce the computational resources required for accurate image classification with deep learning, so its designers focused on finding the best trade-off between the traditional approach of simply increasing network size and depth and the use of sparsity in the layers, following the theoretical groundwork laid by Arora et al. Deep learning systems can consume a great deal of computational resources, and the 22-layer Inception architecture addresses this, in the spirit of Arora et al., by analyzing correlation statistics layer by layer and clustering highly correlated units to feed forward to the next layer. The designers also took up the idea of multiscale analysis of visual information in their parallel 1 × 1, 3 × 3, and 5 × 5 convolution layers, all of which pass through dimension reduction ending in 1 × 1 convolutions.
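This multiscale design can be illustrated with a short Keras sketch of a single Inception module. The filter counts chosen here are illustrative placeholders rather than GoogLeNet's exact values.

from tensorflow.keras import layers

def inception_module(x, f1=64, f3_reduce=96, f3=128, f5_reduce=16, f5=32, fpool=32):
    # Branch 1: plain 1x1 convolution.
    branch1 = layers.Conv2D(f1, 1, activation="relu", padding="same")(x)

    # Branch 2: 1x1 dimension reduction followed by a 3x3 convolution.
    branch3 = layers.Conv2D(f3_reduce, 1, activation="relu", padding="same")(x)
    branch3 = layers.Conv2D(f3, 3, activation="relu", padding="same")(branch3)

    # Branch 3: 1x1 dimension reduction followed by a 5x5 convolution.
    branch5 = layers.Conv2D(f5_reduce, 1, activation="relu", padding="same")(x)
    branch5 = layers.Conv2D(f5, 5, activation="relu", padding="same")(branch5)

    # Branch 4: pooling followed by a 1x1 projection.
    branch_pool = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    branch_pool = layers.Conv2D(fpool, 1, activation="relu", padding="same")(branch_pool)

    # Concatenate the multiscale branches along the channel axis.
    return layers.concatenate([branch1, branch3, branch5, branch_pool])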
The auxiliary classifier of the Inception architecture used in ILSVRC 2014 had the following structure, as denoted by Szegedy et al. (a code sketch of these layers follows the list):
1) An average pooling layer having 5 × 5 filter size and stride 3.
2) A 1 × 1 convolution layer with 128 filters for dimension reduction and rectified linear activation.
3) A fully connected layer having 1024 units and rectified linear activation.
4) A dropout layer having 70% ratio of dropped outputs.
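The four layers above can be sketched in Keras as follows. The final softmax output and the num_classes parameter are assumptions added to make the head complete; they are not part of the list from Szegedy et al.

from tensorflow.keras import layers

def auxiliary_head(x, num_classes):
    x = layers.AveragePooling2D(pool_size=5, strides=3)(x)  # 1) 5x5 average pooling, stride 3
    x = layers.Conv2D(128, 1, activation="relu")(x)         # 2) 1x1 conv, 128 filters, ReLU
    x = layers.Flatten()(x)
    x = layers.Dense(1024, activation="relu")(x)            # 3) fully connected, 1024 units, ReLU
    x = layers.Dropout(0.7)(x)                              # 4) dropout, 70% drop ratio
    # Classifier output (assumed, to complete the head).
    return layers.Dense(num_classes, activation="softmax")(x)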
Neural networks typically detect edges in their earlier layers, shapes in their middle layers, and task-specific features in their later layers. Transfer learning lets us leverage the labeled data of the task the model was originally trained on: since the model has already learned to recognize generic objects, we retrain only the later layers (see the sketch below).
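A minimal Keras sketch of this scheme, assuming an ImageNet-pretrained Inception-V3; the num_classes value and the training datasets are hypothetical placeholders, not part of the surveyed experiments.

from tensorflow.keras import Model, layers
from tensorflow.keras.applications import InceptionV3

num_classes = 101  # e.g., Food-101; assumed placeholder for the target task

# Load the pretrained convolutional base and freeze it, keeping the
# learned edge/shape detectors in the earlier layers fixed.
base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False

# Attach and train only a new classification head for the target task.
x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(num_classes, activation="softmax")(x)
model = Model(base.input, outputs)

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # assumed tf.data datasets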
During transfer learning, we try to transfer as much knowledge as possible from the task the model was previously trained on to the new task at hand. This knowledge can take many different forms depending on the problem and the data; for example, it could be how the model is constructed, which allows us to identify new objects more easily. A lot of data is needed to train a neural network from scratch, but we do not always have access to such data; this is where transfer learning becomes useful, because the model has already been trained in advance. This is especially valuable in fields such as natural language processing, where creating large labeled datasets requires considerable expert knowledge. Training time is also reduced, since training a deep neural network from scratch on a complex task can sometimes take days or even weeks.