
Proceedings of the 38th Chinese Control Conference

July 27-30, 2019, Guangzhou, China

Image Classification Based on Transfer Learning of Convolutional Neural Network

Yunyan Wang1, Chongyang Wang1, Lengkun Luo1, Zhigang Zhou1
1. Hubei University of Technology, Wuhan, Hubei 430068
E-mail: [email protected]

Abstract: Aiming at the problems of timeliness and the scarcity of some kinds of image data in practice, a transfer learning algorithm based on the convolutional neural network (CNN) is proposed, combining histogram of oriented gradient (HOG) feature extraction with support vector machine (SVM) pre-classification. Firstly, the HOG features of training samples whose attributes are similar to those of the samples to be classified are extracted; these HOG features are then fed into an SVM classifier to obtain pre-classification results. Finally, the pre-classification results are used as training samples to train the transfer network of the CNN and obtain a new transfer learning model, which can be used to classify samples similar to the pre-classified ones. The experimental results show that the classification accuracy over the five categories used in this paper (bus, dinosaur, elephant, flower and horse) is effectively improved: the overall classification accuracy reaches 95%, and compared with the traditional classifier algorithm and the convolutional neural network algorithm, the classification accuracy is improved by about 5%.
Key Words: Convolutional neural network, Transfer learning, Image classification, Support vector machine


1 Introduction

In recent years, deep learning algorithms have been widely used for conventional RGB image classification as well as in image fields where feature extraction is difficult, such as medical images and synthetic aperture radar (SAR) images. Conventional deep learning algorithms such as convolutional neural networks can solve image classification and recognition problems and obtain good classification accuracy, but they rely too much on large amounts of data and long training times. How to maintain the validity of a deep learning algorithm with fewer samples is therefore of great significance for image classification and recognition.

When transfer learning algorithms came into researchers' view, these problems were effectively alleviated. Fuzhen Zhuang et al. [1] gave a detailed summary of the history of transfer learning algorithms before 2015, including the application scenarios and the classification of transfer learning. Considering the application of transfer learning within machine learning, Zhijie Xu et al. [2] proposed combining the AdaBoost algorithm, multi-view learning, multi-source learning and transfer learning, but the algorithm has not been applied to image processing. In recent years, neural networks have driven a wave of progress in image processing, and Sinno Jialin Pan et al. first proposed a more systematic theory of transfer learning [3]. Since then, transfer learning has received increasing attention from deep learning researchers. In the context of convolutional neural networks being widely used in image classification, and in order to reduce training time, Maxime Oquab et al. [4] introduced CNN parameters into a transfer learning network, but the method was limited to conventional images. More scholars hope to apply this idea to data of limited quantity and complex features. Li Song et al. [5] applied Sevakula's DNN transfer network to SAR image target recognition and achieved remarkable results in high-dimensional data feature classification. Guandong Li et al. [6] introduced the transfer network proposed by Oquab into high-resolution image scene classification and effectively improved the classification accuracy. Jia Gang et al. [7] applied transfer learning to medical image retrieval. Chu Jinghui et al. [8] proposed a breast tumor diagnosis system based on a CNN transfer network, further extending the application of transfer learning. E. Rezende et al. [9] applied transfer learning to the classification of malware and achieved good results. C. Galea and R. A. Farrugia [10] used transfer learning to identify suspects from sketches.

Transfer learning solves, to some extent, the problem that too few training samples in deep learning lead to over-fitting and local optima [11-12]. However, the problem of under-adaptation in transfer learning is often difficult to solve. This paper proposes an AlexNet transfer learning algorithm combined with a support vector machine, which can alleviate the under-adaptation problem to some extent. In addition, compared with traditional machine learning classification algorithms and deep convolutional learning algorithms, the proposed algorithm reduces the training time, improves the training precision, and also mitigates the over-fitting problem of deep learning in the case of few samples.

2 HOG feature

The Histogram of Oriented Gradient (HOG) feature was proposed by the French researcher Dalal for a pedestrian detection task in 2005 [13]. Its main idea is that the appearance and shape of an image can be described by the density of the gradients and the directions of the edges. Firstly, the target image is divided into small cell units; then the gradient or edge direction histogram of each pixel in the cell unit is collected; finally, these histograms constitute the feature set [14]. The specific process is shown in Fig. 1.

Firstly, the RGB images are converted to grayscale as the input image, and the input image is normalized by the Gamma correction method.
Then the formulas in group (1) are used to calculate the gradient (magnitude and direction) of each pixel of the image. Next, the image is divided into 6*6 cells.

$G_x(x, y) = H(x+1, y) - H(x-1, y)$  (1-1)

$G_y(x, y) = H(x, y+1) - H(x, y-1)$  (1-2)

$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$  (1-3)

$\alpha(x, y) = \tan^{-1}\left(\frac{G_y(x, y)}{G_x(x, y)}\right)$  (1-4)

In the above formulas, x and y are pixel coordinates. The gradient histogram of each cell is counted, and every 3*3 cells constitute a block. All the cell feature vectors within a block are then concatenated to obtain the HOG feature of the block. The last step is to collect the HOG features of all overlapping blocks in the detection window and combine them into the final feature vector used for classification.

Fig. 1: Image HOG Feature Extraction Flowchart (input image → grayscale conversion and Gamma normalization → gradient computation → weighted projection of gradients into per-cell orientation histograms → contrast normalization of the cells in each overlapping block → concatenation of all block histograms into the HOG feature vector)
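As a concrete illustration, the whole pipeline of Fig. 1 is available in a single call in the MATLAB environment used for the experiments in Section 5.1. The sketch below is minimal and hypothetical: the file name is illustrative, and the cell size, block size and bin count anticipate the settings reported in Section 5.2.

% Minimal sketch of the HOG extraction pipeline of Fig. 1, using
% extractHOGFeatures from MATLAB's Computer Vision Toolbox.
img  = imread('elephant_001.jpg');        % hypothetical sample image
gray = rgb2gray(img);                     % grayscale conversion
gray = imresize(gray, [256 256]);         % experimental images are 256*256
% Gradient computation (formulas (1-1)-(1-4)), per-cell orientation
% histograms and block normalization all happen inside extractHOGFeatures.
[hogFeature, vis] = extractHOGFeatures(gray, ...
    'CellSize',  [8 8], ...               % cell size, cf. Section 5.2
    'BlockSize', [2 2], ...               % 2*2 cells per block
    'NumBins',   9);                      % 9 orientation bins
figure; imshow(gray); hold on;
plot(vis);                                % gradient visualization, as in Fig. 5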
3 SVM classification

The basic idea of the support vector machine is to find the separating hyperplane that divides the training data set correctly with the largest geometric margin [15]. Obtaining the optimal classifying hyperplane actually amounts to solving a quadratic programming problem. The classical solution method is the Lagrange multiplier method; the Lagrange function is given in formula (2-1), in which W is the coefficient vector and b is the constant. Taking partial derivatives with respect to W and b yields formula (2-2), in which $X_i$ and $X_j$ are training sample vectors and $y_i$, $y_j$ their class labels. Solving this problem yields the vector $W^*$, shown in formula (2-3). The optimal $a^*$ is determined by the constraint formula (2-4); $a^*$ and $W^*$ can be obtained by a quadratic programming algorithm, and a support vector $X_i$ can then be selected to obtain the value of $b^*$, as shown in formula (2-5). The final optimal discriminant function is given in formula (2-6).

$L(W, a, b) = \frac{1}{2}\|W\|^2 - \sum_{i=1}^{N} a_i \{ y_i (\langle X_i \cdot W \rangle + b) - 1 \}$  (2-1)

$\max Q(a) = \sum_{i=1}^{N} a_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} y_i y_j a_i a_j \langle X_i \cdot X_j \rangle$  (2-2)

$W^* = \sum_{i=1}^{N} y_i a_i^* X_i$  (2-3)

$a_i^* \left[ y_i (\langle W^* \cdot X_i \rangle + b^*) - 1 \right] = 0$  (2-4)

$b^* = y_i - \langle W^* \cdot X_i \rangle$  (2-5)

$f(X) = \sum_{i=1}^{N} y_i a_i^* \langle X \cdot X_i \rangle + b^*$  (2-6)

For some datasets it is difficult to find a separating hyperplane in the original low-dimensional space. In that case a kernel function is introduced to map the data into a high-dimensional space, which is conducive to finding a more effective separating hyperplane. The SVM discriminant function is similar in form to a neural network: the output is a linear combination of M intermediate nodes, and each intermediate node corresponds to one support vector, as shown in Figure 2.

Fig. 2: Schematic diagram of the support vector machine
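As an illustration of this classification stage, the following sketch trains the kernel SVM on HOG features in MATLAB. It is a minimal example under stated assumptions, not the authors' exact code: hogFeatures (one HOG vector per row), labels and testHogFeatures are assumed to have been built from the training and test images, and fitcecoc builds the pairwise (one-versus-one) ensemble described later in Section 5.2.

% Minimal SVM sketch, assuming hogFeatures and labels already exist.
% templateSVM supplies the kernel function discussed above; fitcecoc
% trains one binary SVM per pair of classes (one-versus-one by default).
tmpl = templateSVM('KernelFunction', 'rbf', 'Standardize', true);
svmModel = fitcecoc(hogFeatures, labels, 'Learners', tmpl);

% Pre-classification of new samples (testHogFeatures is hypothetical):
predictedLabels = predict(svmModel, testHogFeatures);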
4 Transfer learning

Using neural networks such as AlexNet and VGGNet for image classification can obtain good classification results. As shown in Figure 3, the AlexNet network takes a 227*227*3 RGB image as input and passes it through convolution layers, pooling layers, activation layers, Dropout layers and fully connected layers until it becomes a 1000-dimensional vector, which is finally classified by a softmax classifier. However, the AlexNet and VGGNet algorithms are too dependent on huge amounts of data and require a lot of computation time. The emergence of transfer learning solves this problem well.

Fig. 3: AlexNet network structure

The purpose of transfer learning is to transfer information between related source and target domains. The main idea is to use the model parameters obtained by training on a large amount of data to classify a small amount of test data and still obtain an acceptable classification accuracy. The main process of transfer learning is as follows; a code sketch is given after Fig. 4.
(1) Use AlexNet to train on a large amount of source domain data and save the model with the best classification performance.
(2) Transfer the best model and its weight parameters.
(3) Use the transferred model to test the data in the target domain.
The specific process is shown in Figure 4. First, AlexNet is trained on the ImageNet dataset, the weight parameters are obtained through a series of convolution and pooling operations, and the model is saved once the optimal result is reached. Secondly, the obtained optimal model is transferred to the smaller sample set for training. After fine-tuning, the model can be used to classify the test samples.

Fig. 4: Transfer learning process
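A sketch of steps (1)-(2) in MATLAB is given below: the AlexNet model pre-trained on ImageNet is loaded and its last three layers are replaced so that the network outputs the five classes used in this paper. The learning-rate factors of 20 follow Section 5.2; the rest is the standard layer-replacement pattern, not necessarily the authors' exact code.

% Transfer step: reuse the ImageNet-trained AlexNet weights and replace
% the last three layers for the 5-class target domain.
net = alexnet;                            % pre-trained source-domain model
layersTransfer = net.Layers(1:end-3);     % keep all but the last three layers
numClasses = 5;                           % bus, dinosaur, elephant, flower, horse
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses, ...
        'WeightLearnRateFactor', 20, ...  % cf. Section 5.2
        'BiasLearnRateFactor',   20)
    softmaxLayer
    classificationLayer];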
5 Experiment

5.1 Experiment preparation

In order to quickly build the CNN network, the experiments use Caffe, a currently popular deep learning framework, and a single-GPU computing mode is adopted to speed up network training. For transfer learning, MATLAB 2017a is chosen to invoke the AlexNet network directly, and training likewise runs in single-GPU acceleration mode.

The experimental scenario assumes that there are 50 special images to be classified, but that the amount of image data with similar attributes is limited. In this case, the transfer learning model is obtained by transferring the parameters of a model trained on a large dataset with similar attributes. For this scenario, 100 random samples of each of the 5 classes were collected.

5.2 Experimental parameter settings

When extracting the HOG features, because the extracted images have a size of 256*256, the HOG cell size is set to [8, 8]; to capture spatial information on a larger scale, the cell size can be increased. The block size is set to [2, 2] cells, which keeps the HOG feature length moderate and the extraction time short. NumBins (the number of orientation histogram bins) is set to 9, and the range of directions covered by the orientation histogram is [-180, 180]. The support vector machine is a binary classifier; there are three methods for multi-class classification: OVR SVMs (one-versus-rest), OVO SVMs (one-versus-one) and hierarchical support vector machines. This experiment adopts the one-versus-one method, in which an SVM is designed between every two classes of samples; when classifying an unknown sample, the class into which it is classified the largest number of times is taken as its class.

In the last three layers of the transfer network, the fully connected layer sets the WeightLearnRateFactor (weight learning rate factor) and the BiasLearnRateFactor (bias learning rate factor) to 20. In order to improve the training speed, the training-related parameters are set as shown in Table 1; a training sketch is given at the end of this subsection.

Table 1: Transfer Network Control Parameters

Attribute            Parameter
MiniBatchSize        10
MaxEpochs            6
InitialLearnRate     1e-4
ValidationFrequency  3
ValidationPatience   Inf

In the experimental network, the pooling layer adopts max-pooling. The error of feature extraction within a neighborhood mainly comes from two sources: an increase in the variance of the estimate caused by the restricted size of the neighborhood, and an offset of the estimated mean caused by convolution parameter errors. In general, mean-pooling reduces the first type of error and retains more background information of the image, while max-pooling reduces the second type of error and retains more texture information. Therefore, max-pooling is used in the transfer network.
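To make the parameter table concrete, here is a hedged sketch of the fine-tuning run with the Table 1 settings in MATLAB. The dataset folder layout and the 70/30 split are illustrative assumptions; layers is the transferred layer array from the sketch in Section 4.

% Fine-tuning with the control parameters of Table 1. The folder layout
% ('dataset' with one subfolder per class) and the split are assumptions.
imds = imageDatastore('dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.7, 'randomized');
% AlexNet expects 227*227*3 inputs, so images are resized on the fly.
augTrain = augmentedImageDatastore([227 227], imdsTrain);
augVal   = augmentedImageDatastore([227 227], imdsVal);
options = trainingOptions('sgdm', ...
    'MiniBatchSize',        10, ...
    'MaxEpochs',            6, ...
    'InitialLearnRate',     1e-4, ...
    'ValidationData',       augVal, ...
    'ValidationFrequency',  3, ...
    'ValidationPatience',   Inf, ...
    'ExecutionEnvironment', 'gpu');   % single-GPU mode, cf. Section 5.1
netTransfer = trainNetwork(augTrain, layers, options);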
5.3 Experimental results

In this experiment, the images obtained after extracting the HOG features of the five classes are shown in Fig. 5; the white points in the images are the directional gradients. Finally, 10 images are randomly selected from the test set and their labels are filled in after identification. Fig. 6 shows the corresponding test images, with the predicted class marked above each image. For this experimental sample, AlexNet and HOG+SVM were used as comparison experiments. The classification accuracy confusion matrix of this experiment is shown in Table 2.

Fig. 5: HOG features

Fig. 6: Results

The classification accuracy confusion matrix for HOG+SVM is shown in Table 3, and that for AlexNet in Table 5. The tables clearly show the classification accuracy of each class and the misclassifications under the different experimental conditions, effectively measuring the effectiveness of the experiment. The classification times of the three experiments, including training time and test time, are shown in Table 4. The training process of the transfer network is shown in Table 6, which records how the learning rate and the loss rate change during training.

Table 6: Transfer Learning Network Learning Rate and Loss Rate Transformation (the learning rate and loss curves themselves are not reproduced here)
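The confusion matrices in Tables 2, 3 and 5 below can in principle be tabulated from the predicted and true labels of the test set; a minimal sketch, assuming the fine-tuned network and a test datastore imdsTest in the style of the earlier sketches:

% Hypothetical evaluation step: classify the test images with the
% fine-tuned network and tabulate a row-normalized confusion matrix.
augTest = augmentedImageDatastore([227 227], imdsTest);
predictedLabels = classify(netTransfer, augTest);
trueLabels = imdsTest.Labels;
cm = confusionmat(trueLabels, predictedLabels);  % raw counts
cmRate = cm ./ sum(cm, 2);                       % per-class rates, as in the tables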

Table 2: AlexNet-transfer+SVM classification results

label \ result  bus    dinosaur  elephant  flower  horse
bus             100%   0.0%      0.0%      0.0%    0.0%
dinosaur        0.0%   85%       10%       0.0%    5.0%
elephant        0.0%   0.0%      90%       0.0%    10%
flower          0.0%   0.0%      0.0%      100%    0.0%
horse           0.0%   0.0%      0.0%      0.0%    100%

Table 3: HOG+SVM classification results

label \ result  bus    dinosaur  elephant  flower  horse
bus             85%    0.0%      5.0%      5.0%    5.0%
dinosaur        30%    20%       10%       40%     0.0%
elephant        20%    0.0%      40%       15%     25%
flower          0.0%   0.0%      50%       40%     10%
horse           0.0%   0.0%      0.0%      0.0%    100%

Table 5: AlexNet classification results

label \ result  bus    dinosaur  elephant  flower  horse
bus             100%   0.0%      0.0%      0.0%    0.0%
dinosaur        30%    20%       10%       40%     0.0%
elephant        20%    0.0%      40%       15%     25%
flower          0.0%   0.0%      50%       40%     10%
horse           0.0%   0.0%      0.0%      0.0%    100%

Table 4: Experiment classification time

Project  AlexNet-transfer+SVM  HOG+SVM  AlexNet
Time     2 m                   17.56 s  8 h

5.4 Experimental analysis

Table 6 makes it evident that importing the training samples of this experiment directly into AlexNet for deep learning leads to over-fitting; the main reason is that the training data set is too small while the training time is long. When the image HOG features are extracted and a support vector machine is used for classification, the classification time is very short, but this is mainly because both the feature extraction and the classification criterion are simple, so it is no surprise that the classification accuracy is not ideal. Compared with these two experiments, the transfer network combining SVM and AlexNet strikes the right balance:
(1) The transfer learning network avoids the over-fitting phenomenon. It can be used with a small training set while still meeting certain classification accuracy requirements.
(2) Compared with the deep learning algorithm, the transfer learning algorithm reuses the weight parameters from the deep learning process, thereby reducing the training time from a few hours to several minutes and improving the integration and applicability of the algorithm.
(3) Compared with the traditional classifier algorithm, the transfer learning algorithm generally achieves superior classification results, and the overall accuracy is improved by about 5%, which fully demonstrates its effectiveness.
6 Conclusions and future prospects

This paper proposes an AlexNet-based transfer learning algorithm combined with the SVM algorithm to avoid the over-fitting caused by too small a training data set. The algorithm reduces the training time as much as possible while improving the accuracy. In addition, the deep learning model used in this algorithm can be optimized and updated as deep learning networks develop. Given the excellent performance of the algorithm on small training sets, it should be well applicable to medical image analysis and region detection. However, to apply this algorithm in practice, the problems of under-adaptation and negative transfer in transfer learning still need to be solved, so as to improve the generalization ability of the transfer model in new situations.

References

[1] D. Cheng. Controllability of switched bilinear systems. IEEE Trans. on Automatic Control, 50(4): 511-515, 2005.
[2] Fuzhen Zhuang, Ping Luo, Qing He, Zhongzhi Shi. Research progress in transfer learning[J]. Journal of Software, 2015, 26(01): 26-39.
[3] Zhijie Xu. Research on Transfer Learning Theory and Algorithm[D]. East China Normal University, 2012.
[4] Pan S J, Yang Q. A Survey on Transfer Learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359.
[5] M. Oquab, L. Bottou, I. Laptev and J. Sivic. "Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 1717-1724. doi: 10.1109/CVPR.2014.222
[6] Song Li, Zhonghao Wei, Bingchen Zhang, Wen Hong. SAR target recognition in deep learning convolutional neural network[J]. University of Chinese Academy of Sciences, 2018, 35(01): 75-83.
[7] Guandong Li, Chunju Zhang, Mingyu Wang, Xueying Zhang. High-resolution image scene classification learning with convolutional neural network transfer[J/OL]. Science of Surveying and Mapping, 2019(06): 1-13.
[8] Jia Gang, Wang Zongyi. Application of hybrid transfer learning method in medical image retrieval[J]. Journal of Harbin Institute of Technology, 2015, 36(7): 938-942.
[9] Jinghui Chu, Zerui Wu, Wei Lu, Zhe Li. Mammary tumor diagnosis system based on transfer learning and deep convolutional neural network[J/OL]. Progress in Laser and Optoelectronics: 1-12 [2018-05-03].
[10] E. Rezende, G. Ruppert, T. Carvalho, F. Ramos and P. de Geus. "Malicious Software Classification Using Transfer Learning of ResNet-50 Deep Neural Network," 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, 2017, pp. 1011-1014. doi: 10.1109/ICMLA.2017.00-19
[11] C. Galea and R. A. Farrugia. "Matching Software-Generated Sketches to Face Photographs With a Very Deep CNN, Morphed Faces, and Transfer Learning," IEEE Transactions on Information Forensics and Security, vol. 13, no. 6, pp. 1421-1431, June 2018. doi: 10.1109/TIFS.2017.2788002
[12] Danfeng Liu, Jianxia Liu. Neural Network Model for Deep Learning Over-Fitting Problem[J]. Journal of Natural Science of Xiangtan University, 2018, 40(02): 100-103.
[13] Li Tao, Wei Yang, Wei Yang. Research on Model Construction and Over-Fitting of Deep Learning[J]. Journal of Computer, 2018.
[14] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]// IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2005: 886-893.
[15] Geismann P, Schneider G. A two-staged approach to vision-based pedestrian recognition using Haar and HOG features[C]// Intelligent Vehicles Symposium, 2008 IEEE. IEEE, 2008.
[16] Shi-Fei D, Bing-Juan Q I, Hong-Yan A T. An Overview on Theory and Algorithm of Support Vector Machines[J]. Journal of University of Electronic Science & Technology of China, 2011, 40(1): 2-10.
