Application of Transfer Learning for Image Classification on Dataset with Not Mutually Exclusive Classes


2021 36th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC) | 978-1-6654-3553-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/ITC-CSCC52171.2021.9501424

Jiayi Fan(1), JangHyeon Lee(2), YongKeun Lee(1)

(1) Graduate School of Nano IT Design Fusion, Seoul National University of Science and Technology, Seoul, Korea ([email protected], [email protected])
(2) Department of Materials Science and Engineering, Korea University, Seoul, Korea ([email protected])

Abstract

Machine learning technologies, especially the deep convolutional neural network (CNN), play an important role in image classification tasks. However, performing image classification with state-of-the-art deep learning models may suffer from a lack of available images for network training and the requirement of computationally powerful machines to conduct the training. In order to classify new classes, transfer learning models are built in this paper based on the pretrained AlexNet and VGG16 to overcome the drawbacks of the deep CNN. The models are applied to a not well-classified image dataset in which the classes are not mutually exclusive, so an image could belong to more than one class. Experimental results are given to evaluate the performance of the transfer learning approach on this dataset, with a conventional CNN used as the benchmark. The results show that the transfer learning models outperform the conventional CNN by a large margin on both coupled and decoupled datasets.

Keywords: Convolutional neural network, deep learning, transfer learning.

1. Introduction

With the booming development of artificial intelligence, deep learning technology, especially the deep convolutional neural network (CNN), has become a powerful tool for image classification tasks [1-3]. It has been applied in numerous applications and has found its way into almost every aspect of human life. A CNN learns features of images by performing convolutional operations and translates them into human-identifiable classes at the output. A large number of deep neural network models have been proposed for image classification, such as AlexNet [4], ResNet [5], MobileNet [6], and GoogLeNet. These deep networks are usually trained on powerful machines with more than a million images from a vast image dataset such as the ImageNet database. They are efficient in classifying a large number of image classes and provide good accuracy. However, the drawbacks of the deep CNN are also apparent: large image datasets are required, which are not always available, and the training cost is high since there are millions of parameters in the network that need to be trained.

In order to overcome these drawbacks, transfer learning is proposed to tailor deep networks to classification on small, custom datasets. The concept of transfer learning mimics human learning behavior in that already obtained knowledge can be transferred to the learning of new things. Various transfer learning algorithms have been proposed for visual categorization, classified into feature-representation transfer and classifier-based knowledge transfer [7]. In deep learning applications, it is common practice to take a pretrained network for feature extraction and transfer the learned features to solve new classification tasks. With transferred layers that carry features learned from hundreds of thousands of images, the newly formed network does not need to be trained on a large dataset; instead, it can quickly learn a new task from a small number of training images. Another obvious advantage of transfer learning with a pretrained deep CNN is that the convolutional features learned from the large dataset are already stored in the weights of the transferred layers. When training the network for new classification tasks, these large numbers of weights can be frozen so that the training cost is significantly reduced. Transfer learning with pretrained CNNs has been actively used to classify various objects of interest and shows prominent results [8-10].

In this paper, transfer learning models based on the pretrained deep CNNs AlexNet and VGG16 are built and evaluated for image classification on a not well-defined dataset where classes are not mutually exclusive. The purpose of using the transfer learning approach is to overcome
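The freeze-and-retrain idea described above can be illustrated with a small NumPy sketch. This is a toy model, not the paper's actual networks: a fixed random projection stands in for the frozen pretrained layers, and only a small softmax classifier head on top is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen, pretrained feature layers: a fixed projection
# whose weights are never updated during training.
W_frozen = rng.standard_normal((64, 16))

def extract_features(x):
    return np.tanh(x @ W_frozen)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy task: labels are a noiseless linear function of the frozen features,
# so a small classifier head can learn them from few examples.
X = rng.standard_normal((300, 64))
F = extract_features(X)
teacher = rng.standard_normal((16, 3))
y = (F @ teacher).argmax(axis=1)

# Train ONLY the new classifier head with plain gradient descent on
# the softmax cross-entropy loss; the frozen weights stay untouched.
W_head = np.zeros((16, 3))
onehot = np.eye(3)[y]
for _ in range(500):
    p = softmax(F @ W_head)
    W_head -= 1.0 * (F.T @ (p - onehot)) / len(y)

accuracy = (softmax(F @ W_head).argmax(axis=1) == y).mean()
```

Because only the 16 x 3 head is updated, the per-step training cost is independent of the size of the frozen feature extractor, which is the cost saving the transfer learning approach relies on.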

Authorized licensed use limited to: Frankfurt University of Applied Sciences. Downloaded on June 17,2022 at 15:30:12 UTC from IEEE Xplore. Restrictions apply.
the limitations of the deep CNN, such as the requirement of a large training dataset and the heavy computational cost. For a dataset with well-decoupled classes, the conventional CNN can perform classification tasks with fair accuracy; however, it suffers if the dataset contains coupled classes. Experimental results show that the deep CNN-based transfer learning models can classify images with better accuracy than the conventional CNN, winning by a large margin whether the classes are coupled or decoupled.

2. Transfer Learning Using Pretrained Network

Fig. 1. Comparison of the structures of the pretrained network and the transfer learning network.
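The structure comparison in Fig. 1 can be written out schematically (the layer names below are illustrative placeholders, not real framework code): the transfer network reuses the pretrained convolutional and early fully-connected layers unchanged and replaces only the final classifier with a custom head.

```python
# Schematic layer lists mirroring Fig. 1 (names are illustrative only).
pretrained_net = ["conv_layers", "fc1", "fc2", "fc3_1000", "softmax_1000"]

def build_transfer_net(pretrained, n_classes):
    """Reuse all layers except the old classifier; attach a custom head."""
    transferred = pretrained[:-2]  # drop the old FC + softmax pair
    custom_head = [f"fc_{n_classes}", f"softmax_{n_classes}"]
    return transferred + custom_head

# Custom classification task with 16 classes, as in Section 3.
transfer_net = build_transfer_net(pretrained_net, n_classes=16)
```

Everything in `transferred` corresponds to the "transferred layers" box of Fig. 1; only the two-entry `custom_head` is trained on the custom dataset.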
2.1 AlexNet

AlexNet, developed by Alex Krizhevsky, is a fast GPU implementation of a CNN that won the ImageNet contest in 2012. It is capable of achieving high accuracy on very challenging datasets. It is trained to classify more than a million ImageNet images into a thousand different categories, and achieved a 37.5% top-1 error rate and a 17.0% top-5 error rate on test data.

The network contains five convolutional layers followed by three fully-connected layers. The output of the last fully-connected layer is fed to a softmax layer to classify the 1,000 classes.

2.2 VGG16

VGG16 is another deep CNN, which surpasses AlexNet; the 16 indicates that the network is 16 layers deep. Proposed by Karen Simonyan and Andrew Zisserman of the Visual Geometry Group at the University of Oxford, VGG16 achieves 92.7% top-5 test accuracy on the ImageNet dataset, and won first place in the localization track and second place in the classification track of the 2014 ImageNet Challenge. The work behind VGG16 investigates the impact of network depth on the accuracy of large-scale image recognition tasks. It found that increasing the network depth while using small convolutional filters can significantly improve on prior-art configurations.

The model takes a 224 x 224 RGB image as input, which is passed through multiple convolutional layers. Within the convolutional layers, filters with a small 3 x 3 receptive field are used to capture the notions of left/right, up/down, and center. In addition, 1 x 1 convolutional filters are used as a linear transformation of the input channels. The convolutional stride is 1 pixel, and the spatial padding is 1 pixel. Five max-pooling layers are inserted at specific locations and are performed over a 2 x 2 pixel window with stride 2. Three fully-connected layers are placed at the end of the stack of convolutional layers; the first contains 4,096 channels, and the last contains 1,000 channels for the 1,000 classes of the ImageNet Challenge. A softmax layer is used as the final layer.

2.3 Transfer Learning Models

The transfer learning approach, in terms of a CNN, replaces the original classifier in the pretrained network with a new classifier for the images in the new dataset; the rest of the structure of the newly formed transfer network is the same as that of the pretrained network. The pretrained networks used to build the transfer learning models are AlexNet and VGG16, which have been trained with millions of images beforehand. The comparison of the structures of the pretrained network and the new network for a specific image classification purpose is shown in Fig. 1. To utilize the pretrained network in the newly constructed network, the last fully-connected layer of AlexNet or VGG16 is replaced with a new classifier, which contains a fully-connected layer whose size equals the number of classes in the new dataset, followed by a softmax layer and a classification output layer. To transfer the learned features of AlexNet and VGG16 to the new classification network, the weights in the transferred layers are kept frozen; only the weights in the newly added layers are trained to fulfill the new classification tasks.

3. Experimental Results and Discussion

A dataset from Kaggle containing abstract categories has been used to evaluate the performance of the transferred networks. The images in this dataset (namely Dataset I) are scraped from the Unsplash website and are classified into 16 different categories, each containing 500 images. The classes of Dataset I are not mutually exclusive: they include animals, architecture, arts-culture, athletics, business-work, fashion, food-drink, health, history, interiors, nature, people, street-photography, technology, textures-patterns, and travel, as shown in Fig. 2. Some classes overlap, such as animals/nature, health/food-drink, and street-photography/travel. An image might belong to several classes; however, each image has only one label in the dataset.

Fig. 2. Image examples from the custom dataset.

Fig. 3. Training results of the AlexNet-based transfer learning model.
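The top-1 and top-5 error rates cited for AlexNet in Section 2.1 can be computed from a score matrix as follows. This is a generic evaluation sketch, not the original benchmark code; the scores and labels are made-up examples.

```python
import numpy as np

def topk_error(scores, labels, k):
    """Fraction of samples whose true label is NOT among the k highest scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]  # indices of the k largest scores per row
    hits = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

# Tiny example: 4 samples, 5 classes.
scores = np.array([[0.10, 0.50, 0.20, 0.10, 0.10],
                   [0.30, 0.10, 0.40, 0.10, 0.10],
                   [0.20, 0.20, 0.20, 0.30, 0.10],
                   [0.60, 0.10, 0.10, 0.10, 0.10]])
labels = np.array([1, 0, 3, 2])
top1 = topk_error(scores, labels, k=1)  # samples 2 and 4 are misclassified -> 0.5
```

By construction the top-k error is non-increasing in k, and with k equal to the number of classes it is always zero, which is why the top-5 figure is always at most the top-1 figure.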
Dataset I is divided into a training set, a validation set, and a test set, which contain 4,800 images, 1,600 images, and 1,600 images, respectively. In order to evaluate how transfer learning performs on a dataset with coupled classes, transfer learning networks are built using AlexNet and VGG16. The 1,000-class classifier is replaced with a new classifier, and the newly added fully-connected layer has a size of 16 to match the number of classes in Dataset I. The images, which come in various sizes, are first resized to suit the input layers of AlexNet and VGG16. During the training process, the weights of the transferred layers are effectively frozen by specifying a much higher learning rate in the classifier than in the transferred layers. The training is carried out for 6 epochs using the SGDM (stochastic gradient descent with momentum) solver. The training results of the transferred models are shown in Fig. 3 and Fig. 4, and the classification accuracy on Dataset I is shown in Table I. For comparison, a conventional CNN is used as the benchmark. It can be seen that the VGG16-based transfer learning network has the best performance among the three networks, achieving an accuracy of 0.5069 on a dataset with coupled classes. The AlexNet-based model comes second with an accuracy of 0.4500. Both transfer learning networks have far better test accuracy than the conventional CNN, whose accuracy is only a poor 0.1613. This shows that the conventional CNN suffers badly from the non-decoupled classes and struggles to produce acceptable classification accuracy, while the transfer learning models can still yield credible results.

Fig. 4. Training results of the VGG16-based transfer learning model.

Table I: Classification accuracy on not mutually exclusive Dataset I (16 classes)

                      VGG16-based   AlexNet-based   Conventional CNN
Validation accuracy   0.5294        0.4613          0.1550
Test accuracy         0.5069        0.4500          0.1613

Even though the classification accuracy of the two transfer learning networks on Dataset I is not high, the classification score for each class can still provide insight into which classes an image should belong to. Fig. 5 shows two example images taken from the test set of Dataset I. The true classes of the two images are animals and health, respectively. For Image A, one can naturally reckon that the image belongs to two classes, animals and nature. The AlexNet-based network correctly classifies the image as animals with a score of 0.9002. The VGG16-based network, even though it finally classifies the image as nature with a score of 0.5174, still gives a fairly confident score (0.4579) to animals, and no other class gets a score higher than 0.0150. As for Image B, the VGG16-based network classifies it into food-drink with a score of 0.8249 but also gives a score of 0.1748 to the class health, while the scores for the rest of the classes are all too small to be considered. Common sense suggests that Image B could be put into both the food-drink and health classes.
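The way the per-class scores hint at secondary classes can be captured with a small post-processing helper. This is a hypothetical step, not part of the paper's method: the 0.1 threshold and the "other" entry are illustrative, with the nature/animals scores taken from the Image A discussion above.

```python
def candidate_classes(scores, threshold=0.1):
    """Return all classes whose score exceeds the threshold, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]

# Scores reported for Image A under the VGG16-based network (Section 3);
# "other" stands in for the remaining classes, all at or below 0.0150.
image_a = {"nature": 0.5174, "animals": 0.4579, "other": 0.0150}
candidates = candidate_classes(image_a)  # ['nature', 'animals']
```

Reading off every class above a threshold, rather than only the argmax, turns the single-label classifier into a rough multi-label predictor for images whose classes overlap.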

Fig. 5. Example images taken from the test set of Dataset I. (a) Image A, label: animals. (b) Image B, label: health.

Table II: Classification accuracy on mutually exclusive Dataset II (2 classes)

                      VGG16-based   AlexNet-based   Conventional CNN
Validation accuracy   0.9900        0.9650          0.7450
Test accuracy         0.9950        0.9650          0.7100

In order to demonstrate the effectiveness of transfer learning on a well-classified dataset, Dataset I is cropped to remove the overlapping classes, constructing Dataset II. Dataset II contains only two mutually exclusive classes, namely animals and architecture, and is also divided into a training set, a validation set, and a test set with a ratio of 3:1:1. A comparison of the classification accuracy of the VGG16-based network, the AlexNet-based network, and the conventional CNN on Dataset II is given in Table II. It can be seen that the classification accuracies of the two transfer learning networks on Dataset II, with its well-decoupled classes, are both above 96%, much improved compared with those on Dataset I, whose classes are not mutually exclusive. Moreover, the two transfer learning networks still outperform the conventional CNN (71% test accuracy) on Dataset II by a large margin.

4. Conclusion

This paper evaluates the performance of the deep CNN-based transfer learning approach for image classification on a dataset with not mutually exclusive classes. The deep CNN has the major drawback of requiring a large dataset to train the network, while the conventional CNN has low accuracy when used on a dataset with coupled classes. Therefore, transfer learning based on AlexNet and VGG16 is proposed for this kind of dataset. Experimental results show that when the transfer learning approach is used on the dataset with not mutually exclusive classes, it achieves acceptable accuracy, and the classification scores reflect the potential classes to which an image might belong, while the conventional CNN fails with low accuracy. When the classes in the dataset are decoupled, the accuracy of the transfer learning approach increases dramatically, and it still outperforms the conventional CNN by a large margin.

Acknowledgment

This work was supported by Seoul National University of Science and Technology, Seoul, South Korea.

References

[1] J. Wang, Y. Zheng, M. Wang, Q. Shen and J. Huang, "Object-Scale Adaptive Convolutional Neural Networks for High-Spatial Resolution Remote Sensing Image Classification," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 283-299, 2021.

[2] J. Zhang, Y. Xia, Y. Xie, M. Fulham and D. D. Feng, "Classification of Medical Images in the Biomedical Literature by Jointly Using Deep and Handcrafted Visual Features," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 5, pp. 1521-1530, Sept. 2018.

[3] H. Lee and H. Kwon, "Going Deeper With Contextual CNN for Hyperspectral Image Classification," IEEE Transactions on Image Processing, vol. 26, no. 10, pp. 4843-4855, Oct. 2017.

[4] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, 2012.

[5] K. He, X. Zhang, S. Ren and J. Sun, "Deep residual learning for image recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[6] A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.

[7] L. Shao, F. Zhu and X. Li, "Transfer Learning for Visual Categorization: A Survey," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 5, pp. 1019-1034, May 2015.

[8] A. Hanni, S. Chickerur and I. Bidari, "Deep learning framework for scene based indoor location recognition," 2017 International Conference on Technological Advancements in Power and Energy (TAP Energy), IEEE, 2017.

[9] M. Fradi, M. Afif, E.-H. Zahzeh, K. Bouallegue and M. Machhout, "Transfer-Deep Learning Application for Ultrasonic Computed Tomographic Image Classification," 2020 International Conference on Control, Automation and Diagnosis (ICCAD), Paris, France, 2020, pp. 1-6.

[10] Z. N. K. Swati et al., "Brain tumor classification for MR images using transfer learning and fine-tuning," Computerized Medical Imaging and Graphics, vol. 75, pp. 34-46, 2019.
