Research Article: Deep Learning Model of Image Classification Using Machine Learning

This research article presents a deep learning model for image classification that addresses the inefficiencies and low accuracy of traditional machine learning methods. The authors propose an improved convolution neural network structure that enhances feature extraction and classification accuracy, demonstrating its effectiveness through experimental comparisons. The results indicate significant improvements in classification performance after optimizing the model, making it suitable for large datasets in various applications.


Hindawi

Advances in Multimedia
Volume 2022, Article ID 3351256, 12 pages
https://doi.org/10.1155/2022/3351256

Research Article
Deep Learning Model of Image Classification Using
Machine Learning

Qing Lv, Suzhen Zhang, and Yuechun Wang


Shijiazhuang Posts and Telecommunications Technical College, Shijiazhuang 050021, China

Correspondence should be addressed to Suzhen Zhang; [email protected]

Received 26 May 2022; Revised 19 June 2022; Accepted 29 June 2022; Published 19 July 2022

Academic Editor: Qiangyi Li

Copyright © 2022 Qing Lv et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Traditional artificial neural networks and machine learning methods struggled to meet the processing needs of
massive image collections in feature extraction and model training, and they showed low efficiency and low
classification accuracy when applied to image classification. This paper therefore proposed a deep learning model
for image classification, which aimed to provide a foundation for the classification and recognition of images in
large datasets. Firstly, based on an analysis of the basic theory of neural networks, the paper described the
different types of convolution neural network and the basic process of applying them to image classification.
Secondly, starting from the existing convolution neural network model, noise reduction and parameter adjustment
were carried out in the feature extraction process, and a deep learning model for image classification was proposed
based on the improved convolution neural network structure. Finally, the structure of the deep learning model was
optimized to improve its classification efficiency and accuracy. To verify the effectiveness of the proposed model,
the relationship between classification accuracy and the number of iterations was compared experimentally for
several common network models. The results showed that the proposed model was better than the other models in
classification accuracy. In addition, the classification accuracy of the deep learning model before and after
optimization was compared on the training set and the test set. The results showed that the accuracy of image
classification improved greatly after the proposed model had been optimized.

1. Introduction

Selecting image features manually is time-consuming and laborious, and the feature quality depends to a certain
extent on professional knowledge and practical experience, which limits the application of manual feature
selection [1]. With the advent of the era of big data and artificial intelligence, the traditional manual way of
obtaining image information can no longer meet the needs of applications in related fields. In recent years, methods
of obtaining images by computer have made some progress, but traditional computer vision is still inefficient at
processing massive image information. For example, traditional machine vision methods have low accuracy in detecting
low-level and high-level features of images [2]. Classification and processing of massive image information is the
prerequisite for in-depth mining of image information. It is also of great significance for the expansion of
computer vision in the field of image processing applications. At present, research on image classification by
computer has been gradually applied to agriculture, industry, aerospace, and other fields. Research on image
classification mainly covers the efficient and accurate classification of large-scale image information and the
accurate classification of images with semantically similar features.
To recognize and classify an image, we need to use an image acquisition tool to obtain the original image
information, then clean the image data and extract and filter the features, and finally use a machine learning
algorithm to realize the image classification. With the deepening application of machine vision in many fields,
people can use computer vision technology to quickly carry out processing operations on image information, such as
noise reduction and feature extraction [3]. The research of image
classification based on machine learning has made some progress. From manual recognition to the continuous
application of computer vision technology in image classification, scholars have made some achievements in the
research of image classification and recognition.
Machine learning is an important branch of artificial intelligence. Although machine learning has seen half a
century of development, some problems remain unsolved, for example, complex image understanding and recognition,
natural language translation, and recommendation systems [4]. Deep learning is an important branch developed on the
basis of machine learning. It makes full use of the hierarchical characteristics of artificial neural networks and
biological neural systems to process information, and it obtains high-level features by learning low-level features
and combining them, so as to realize image classification or regression. Unlike traditional machine learning, deep
learning uses a multilayer neural network to learn from the image automatically and extract its deep features.
Different deep learning models can be formed according to how features are learned and combined. However, the
classification accuracy of existing deep learning models is not high and their operating efficiency is low.
Therefore, based on the existing basic theory of convolution neural networks, this paper establishes a deep learning
model for image classification by improving the structure of the convolution neural network, which provides a basis
for image classification under complex environmental conditions.

2. Related Works

Image classification provides an important basis for in-depth image processing and for the application of computer
vision technology in related fields. Traditional image classification mainly goes through several stages, such as
image preprocessing, feature extraction, classifier construction, and learning and training [5]. Traditional methods
mainly use the extracted basic image features to realize image classification, which can provide a basis for the
computer to further obtain the semantic information of images. They generally use image color, texture, and other
information to calculate image features and use support vector machines and logistic regression to realize image
classification [6]. The results of image classification depend to a great extent on the extracted features and are
also affected by the knowledge and experience of the relevant fields.
Manually acquired features are difficult to apply to image classification, and a lot of time is spent analyzing
feature data. At the same time, traditional machine learning cannot be applied to the processing of large datasets,
and it is difficult to optimize feature design, feature selection, and model training, which makes the
classification effect of the model poor. Therefore, image classification methods using traditional machine learning
are limited in many application fields [7]. Research shows that because texture, shape, and color features can be
used for image classification and recognition, low-level basic features can be used as the basis of image
classification. Traditional image classification methods generally use single feature extraction or feature
combination and take the extracted features as the input of a support vector machine. In recent years, some progress
has been made in image classification using artificial neural network classifiers. To improve the accuracy of image
classification, we can focus on the standardized design of low-level features such as texture, shape, and color.
Deep learning trains on large-scale datasets through a multilevel network model and uses layer-by-layer feature
extraction to obtain the high-level features of the image. The deep learning network model not only extracts the
basic features of the image but also obtains its deep features through multiple hidden layers. Compared with
traditional machine learning methods, the features obtained by deep learning are not only accurate but also
conducive to image classification. In the process of image recognition and classification, the way features are
learned and combined is mainly determined by the deep learning model [8]. At present, the commonly used deep learning
models are the sparse model, the restricted Boltzmann machine model, and the convolution neural network model.
Although these models differ in feature extraction, they are similar in image classification and recognition. They
all go through the steps of image information input, data preprocessing, feature extraction, model training, and
classification output.
In image classification, scholars have carried out a lot of research on image feature representation and classifier
selection. For example, deep learning models based on feature representation can be effectively applied to the
recognition and classification of various images. Some scholars use deep convolution neural networks (DCN) to
extract deep image features and apply them to the large-scale dataset ImageNet [9]. Experiments show that the model
can effectively classify large image datasets. In addition, the deep learning model can effectively learn and
describe image features. For example, it can better describe hierarchical features through unsupervised learning,
and the features extracted by the model not only have strong expressive ability but also improve the efficiency of
image classification. A large number of research results show that, as research on image classification methods
deepens in related fields, deep learning models have gradually replaced traditional manual feature extraction and
machine learning methods and will receive wide attention from scholars in the field of image recognition and
classification [10].

3. Fundamentals of Image Classification Algorithm

3.1. Basic Theory of Neural Network. The traditional neural network, referred to as artificial neural network (ANN),
was a hot spot in the early field of artificial intelligence [5]. An artificial neural network mainly uses the
neurons of the network model to abstract the characteristics of external things, so that a computer can use them to
complete information processing.
Artificial neural networks generally establish the corresponding network structure according to how the neurons are
constructed. A neural network is an operation model composed of several nodes, or neurons, connected with each
other. Each node in the model is a processing function, and the connections between nodes carry weights that
represent the memory ability of the artificial neural network. The output of a neural network depends on the
connection form, the weight values, and the excitation function of the nodes. At the same time, the neural network
model is mainly constructed according to some algorithm or function to express a specific logical operation.
A basic neural network model usually includes an information input layer, a hidden layer, and a result output layer.
Each layer can contain several neurons [11]. A neuron represents a transformation or operation, which is completed
by its activation function. Neurons in two adjacent layers are connected to each other, as shown in Figure 1.
As can be seen from Figure 1, the neural network model includes 11 neurons: 3 in the input layer, 5 in the hidden
layer, and 3 in the output layer. The structure is a two-layer neural network model, where W1 and W2 are the weight
matrices of the hidden layer and the output layer, respectively.
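
The two-layer structure described above (3 input neurons, 5 hidden neurons, 3 output neurons, weight matrices W1 and W2) can be sketched as a forward pass. This is only an illustrative sketch: the tanh activation, the random weight values, the sample input, and the absence of bias terms are assumptions made here, not details given in the paper.

```python
import numpy as np

def forward(x, W1, W2):
    """Forward pass of a two-layer network: input -> hidden -> output."""
    hidden = np.tanh(W1 @ x)   # hidden layer: weights W1, then an assumed tanh activation
    output = W2 @ hidden       # output layer: weights W2 (raw scores, no activation here)
    return hidden, output

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))     # 5 hidden neurons, each connected to the 3 inputs
W2 = rng.normal(size=(3, 5))     # 3 output neurons, each connected to the 5 hidden neurons
x = np.array([0.2, -0.1, 0.7])   # one hypothetical input sample with 3 features

hidden, output = forward(x, W1, W2)
print(hidden.shape, output.shape)  # (5,) (3,)
```

Every neuron in one layer is connected to every neuron in the next, so W1 holds 5 × 3 weights and W2 holds 3 × 5 weights, 30 connections in total.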
Figure 1: Connections between neurons in different layers.

Deep learning is a part of machine learning. It is widely used in natural language recognition and in image
detection and classification. Deep learning derives from the theory of artificial neural networks: by reference to
the hierarchical way the human brain processes information, neural networks with different levels are established.
Deep learning extracts multilevel feature information by simulating the human brain, so as to obtain the key feature
information of images, text, and other data. It mainly describes the characteristics of a specific object through
hierarchical processing of a large amount of edge feature information, a process that runs from low-level feature
extraction to high-level feature combination. As an important method of machine learning, deep learning is an
effective way to process big data and obtain abstract features using a neural network model.
A deep learning model is a multilayer deep neural network. Following the connection pattern of human brain neurons,
the sample features are processed in the different layers of the model, and the deep features of the sample data are
obtained in turn. Like the deep neural network, the artificial neural network has a hierarchical structure. The
model is a multilayer perceptron, including an input layer, hidden layers, and an output layer. Unlike the deep
neural network, the artificial neural network model contains only two to three layers of feedforward network, and
the number of neurons in each layer is small, so its ability to process large datasets is limited. Because the deep
neural network model contains many layers and each layer contains a large number of neurons, it can realize the
abstract expression of data and also has a strong learning capability. Compared with traditional machine learning
methods, the deep learning model does not need to rely on manual design and feature extraction. It not only
addresses the shortage of manual expertise or knowledge but also avoids the preference problem in feature
extraction. At the same time, the deep learning model can obtain representative high-level features by combining
low-level features [12]. Figure 2 compares the working processes of deep learning and traditional machine learning.
In the neural network model, the activation function performs a nonlinear operation on the input data of each neuron
in order to extract effective feature information from the original input data. The activation function is a
nonlinear function. As the number of layers of the neural network model increases, the most effective feature
information can be obtained after many iterations of training on the data.
To classify features, the Softmax function is often used as the activation function in the output layer of the
neural network model [13]. The Softmax function is generally used as a classifier. It is calculated as follows:

$$ S_k = \frac{\exp(R_k)}{\sum_{j=1}^{n} \exp(R_j)}. \tag{1} $$

In the above formula, n is the number of neurons in the current layer and R_k is the nonlinear transformation value
of the k-th neuron in this layer.
Softmax is a classifier that can output different feature categories. The value of each Softmax output neuron can
be regarded as the probability of the corresponding category. The calculation process of the Softmax function is
shown in Figure 3.
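
Equation (1) can be checked numerically with a short sketch. The input values below are hypothetical, and subtracting the maximum before exponentiating is a common numerical safeguard added here rather than something the paper describes; it leaves the result of equation (1) unchanged.

```python
import numpy as np

def softmax(r):
    """Equation (1): S_k = exp(R_k) / sum_j exp(R_j)."""
    shifted = r - np.max(r)   # same result as equation (1), but avoids overflow in exp
    e = np.exp(shifted)
    return e / e.sum()

R = np.array([2.0, 1.0, 0.1])   # hypothetical values R_1..R_3 from the last layer
S = softmax(R)
predicted = int(np.argmax(S))   # index of the most probable category

print(S, S.sum(), predicted)
```

The outputs are all positive and sum to 1, which is why each S_k can be read as the probability of category k, and the largest input always receives the largest probability.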
Figure 2: Comparison between deep learning model and traditional machine learning algorithm. (a) Deep learning
algorithm: image data, basic feature extraction, multi-layer feature extraction, weight training, result output.
(b) Traditional machine learning algorithm: image data, artificial feature extraction, weight training, result
output.
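
Section 3.1 states that the activation function is nonlinear. A minimal sketch of why that matters for multi-layer feature extraction: without an activation, two stacked layers reduce to a single linear layer. The weights and input are small hand-picked values for illustration only, and ReLU is used simply because it is the activation the paper mentions for AlexNet and Vgg-16.

```python
import numpy as np

def relu(z):
    """ReLU activation: negative values become 0, positive values pass through."""
    return np.maximum(z, 0.0)

# Small hand-picked weights (illustrative values, not taken from the paper).
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # hidden layer weights
W2 = np.array([[1.0, 1.0]])    # output layer weights
x = np.array([1.0, 2.0])       # one input sample

# Without an activation, the two layers collapse into one linear map W2 @ W1.
stacked_linear = W2 @ (W1 @ x)
single_linear = (W2 @ W1) @ x
print(stacked_linear, single_linear)   # [0.] [0.]

# With ReLU in between, the hidden value -1 is clipped to 0 and the output changes.
with_relu = W2 @ relu(W1 @ x)
print(with_relu)                       # [1.]
```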

Figure 3: Schematic diagram of Softmax function calculation (inputs R1, R2, ..., Rk, ..., Rn enter the Softmax
function, which outputs S1, S2, ..., Sk, ..., Sn).

3.2. Basic Theory of Convolution Neural Network. Convolution neural network (CNN) is a typical network structure in
deep learning [14]. Unlike traditional machine learning, the convolution neural network is well suited to processing
image and time series data, especially for image classification and language recognition. The basic structure of the
convolution neural network is shown in Figure 4.
A convolution neural network generally includes three types of processing layers: convolution layer, pooling layer,
and full connection layer. The feature extraction task is completed by the convolution layer, and the pooling layer
is used for feature mapping. The full connection layer is similar to the general neural network structure: the nodes
in this layer are not connected to each other but are fully connected to the nodes of the previous layer. In
addition, like other neural networks, convolution neural networks also have a data input layer and a result output
layer.
The computation of a convolution neural network is mainly completed in the convolution layer, and the convolution
kernel in that layer is the core of the model. The convolution layer uses the convolution kernel to convolve the
input image and extract its feature information. Images processed by the convolution operation gradually become
smaller, and the pixels at the edge of the image have little effect on the output.
As shown in Figure 5, assuming that the original input image is a 3 × 3 matrix, the image is convolved with a
convolution kernel of size 2 × 2, and the corresponding 2 × 2 feature map is output.
Generally, there is a strong correlation between adjacent pixels in the image. The convolution kernel mainly extracts
features from the local area of the image and sends the extracted local features to higher layers for integration.
Because the low-level features of an image are independent of their position, the same convolution kernel can be
used to extract them everywhere, and the weight sharing of the convolution kernel reduces the number of parameters
of the neural network, which improves the training efficiency of the model.

Figure 4: Structure diagram of convolution neural network (image input, input layer, convolution layer, pool layer,
full connection layer, output layer).

Figure 5: Schematic diagram of convolution operation process.
  Image (3 × 3):               a1 a2 a3
                               b1 b2 b3
                               c1 c2 c3
  Convolution kernel (2 × 2):  1 3
                               5 7
  Feature output (2 × 2):      a1+3a2+5b1+7b2   a2+3a3+5b2+7b3
                               b1+3b2+5c1+7c2   b2+3b3+5c2+7c3

For complex images, in order to reduce the number of parameters the model has to train, the pooling layer in the
convolution neural network can be used to reduce the size of the feature map. During pooling, the depth of the
feature map remains unchanged while its width and height are reduced. The operation of the pooling layer generally
includes max pooling and average pooling [15], as shown in Figure 6.

3.3. Convolution Neural Network Model. Existing image detection and recognition models, such as ResNet, Mask R-CNN,
and Faster R-CNN, are usually based on common network models [16]. LeNet is the most basic convolution neural
network model [9]. After transforming the LeNet model, the LeNet-5 model is established to classify ordinary images.
Because the LeNet-5 model is not deep enough, it cannot extract enough image features during model training.
Therefore, it cannot be applied to the classification of complex images.
For this reason, the AlexNet model was put forward based on the LeNet structure. It applied the convolution neural
network to the processing of complex images and provided a theoretical basis for the application of deep learning
models in the field of computer vision [17]. The network structure of AlexNet is shown in Figure 7.
AlexNet is a network structure with 8 layers. The model includes 5 convolution layers and 3 full connection layers.
It uses the ReLU function as the activation function to avoid the gradient dispersion (vanishing gradient)
phenomenon caused by the large number of layers of the network model. To reduce network training time, AlexNet uses
multiple GPUs for training. To suppress neurons with a small response, AlexNet uses an LRN (local response
normalization) layer to establish a competition mechanism between neurons, so that values with a large response
become relatively larger, which increases the generalization ability of the model. In addition, in order to keep
some neurons out of forward propagation and back
propagation, the dropout mechanism is adopted in the full connection layers of the AlexNet model, so that the
outputs of randomly selected hidden-layer neurons are set to 0 during training, so as to avoid complex interactions
between neurons.

Figure 6: Schematic diagram of pool layer operation type.
  Input (4 × 4):     3 2 6 5
                     2 5 3 2
                     1 4 4 6
                     2 5 2 8
  Average pooling:   3 4
                     3 5
  Max pooling:       5 6
                     5 8

Figure 7: Schematic diagram of AlexNet network model structure (image input 227×227×3; Conv1 with 11×11 kernel,
3×3 max pool; Conv2 with 5×5 kernel, 3×3 max pool; Conv3, Conv4, and Conv5 with 3×3 kernels, 3×3 max pool; layers
of size 9216, 4096, and 4096 (FC, FC); Softmax function).

At present, VggNet is a widely used deep convolution neural network model. Compared with other models, VggNet not
only has better generalization ability but also can be effectively used for the recognition of different types of
images [11]. For example, convolution neural networks such as FCN, UNet, and SegNet are based on the VggNet model.
In recent years, the Vgg-16 and Vgg-19 networks have been the commonly used VggNet models [18]. The network
structure of Vgg-16 is shown in Table 1.
The Vgg-16 network structure has 16 layers in total, excluding the pooling layers and the Softmax layer. The
convolution kernel size is 3 × 3, the pooling layer size is 2 × 2, and the pooling layer adopts the max pooling
operation with a step size of 2.

Table 1: The network structure of Vgg-16.

Layer type              Convolution kernel size   Feature map size   Convolution kernel number
Input layer             -                         448 × 448          -
Convolution layer       3 × 3                     448 × 448          128
Pool layer              3 × 3                     448 × 448          -
Convolution layer       3 × 3                     224 × 224          256
Pool layer              2 × 2                     112 × 112          -
Convolution layer       3 × 3                     112 × 112          768
Pool layer              2 × 2                     56 × 56            -
Full connection layer   4096                      -                  -
Full connection layer   4096                      -                  -
Full connection layer   1000                      -                  -
Softmax classifier      -                         -                  -
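
The convolution of Figure 5, the pooling of Figure 6, and the 2 × 2, stride-2 pooling that Vgg-16 uses can be sketched in a few lines. This is a simplified single-channel illustration, not the paper's implementation: the function names `conv2d_valid` and `pool2x2` are hypothetical, and the numeric image standing in for a1..c3 is an assumption. The kernel and the 4 × 4 pooling input are the ones shown in Figures 5 and 6.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def pool2x2(feature_map, mode="max"):
    """2 x 2 pooling with stride 2; height and width must be even."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

# Figure 5: 3 x 3 image, 2 x 2 kernel [[1, 3], [5, 7]] -> 2 x 2 feature map.
image = np.arange(1.0, 10.0).reshape(3, 3)   # stands in for a1..c3 (illustrative values)
kernel = np.array([[1.0, 3.0], [5.0, 7.0]])
feature = conv2d_valid(image, kernel)
print(feature)

# Figure 6: the 4 x 4 input and its two pooled versions.
fmap = np.array([[3, 2, 6, 5],
                 [2, 5, 3, 2],
                 [1, 4, 4, 6],
                 [2, 5, 2, 8]], dtype=float)
max_pooled = pool2x2(fmap, "max")    # [[5. 6.] [5. 8.]]
avg_pooled = pool2x2(fmap, "mean")   # [[3. 4.] [3. 5.]]
print(max_pooled, avg_pooled)

# Each stride-2 pooling halves the side length: 448 -> 224 -> 112 -> 56.
sizes = [448]
for _ in range(3):
    sizes.append(sizes[-1] // 2)
print(sizes)
```

With the illustrative image, the top-left output is 1·1 + 3·2 + 5·4 + 7·5 = 62, which follows the pattern a1+3a2+5b1+7b2 from Figure 5. The halving sequence matches the feature map sizes that appear in Table 1.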
Vgg-16 network uses convolution blocks instead of single convolution layers. Each convolution block contains 2 to 3
convolution layers, which helps to reduce the number of model parameters. At the same time, Vgg-16 adopts the ReLU
activation function to enhance the training ability of the model. Although the Vgg model has more layers than the
AlexNet model, its convolution kernels are smaller. Therefore, the Vgg model needs fewer training iterations than
the AlexNet model.

4. Image Classification Model Based on Improved CNN

4.1. Improved Image Classification Model Framework. Because image classification algorithms are usually used in
systems with high real-time requirements, they need to take real-time performance into account. With complex neural
network models, image classification consumes a lot of time. Therefore, this paper simplifies the VggNet model and
takes it as the basis of the image classification model.
Considering the distribution characteristics of the datasets used for model training, a typical dataset can be
selected to pretrain the model so that its weights initialize training on the target dataset. When the pretrained
model reaches a certain accuracy, the number of nodes in the Softmax layer is reduced by a factor of ten, and then
the dataset is used for weight training. Considering that the data processed by the model may be affected by various
kinds of noise, a noise reduction autoencoder is added to the model to eliminate noise interference, and the
existing dataset is extended through data enhancement (data augmentation) to improve the generalization ability of
the model.
Considering that the image classification algorithm needs to meet certain real-time requirements, the corresponding
image classification model is established and optimized on the basis of the VggNet model, using an algorithm that
combines the convolution neural network with the noise reduction autoencoder. Because overfitting may occur in image
classification, the model can be optimized by data enhancement. Compared with other algorithms, this classification
algorithm retains a degree of generalization performance when the amount of data is small. In addition, the noise
reduction autoencoder effectively reduces the impact of data noise on the performance of the model, which helps to
ensure good generalization ability. Since the improved algorithm is based on the VggNet network, the training time
of the model may increase [19]. The improved image classification algorithm is shown in Figure 8.
To handle automatic noise reduction for images with complex structure, this paper uses an encoder network with
normalization for classification. Based on the existing convolution neural network model, the noise reduction
autoencoder and the sparse autoencoder are combined, and the original input image information is normalized in the
sparse autoencoder. Then, the improved convolution neural network model is used to extract the image feature
information, and the Softmax classifier is used to classify the features [14].
When the improved convolution neural network model is used to classify images, it is necessary first to preprocess
the images (for example, noise reduction and grayscale conversion), select a certain number of training and test
sets from the dataset, and then take the training set as the input of the model after unsupervised learning.
Secondly, the hidden layer of the noise reduction autoencoder encodes and decodes the input, and the results are
passed to the sparse autoencoder of the next layer for normalization. The data is trained layer by layer through the
hidden layers of the sparse autoencoder, and finally the training results of the sparse autoencoder are output to
the Softmax classifier. To improve the classification accuracy, the gradient descent method can be used to further
train the parameters of the classifier, which improves the performance of the deep learning model for image
classification. Finally, the network model is verified on the image test set, and the effectiveness of the image
classification method is assessed from the classification results output by the model.
The improved convolution neural network model can overcome the problem that the traditional neural network is
limited to only some features in image classification. Through the normalization of the sparse autoencoder,
overfitting of the model in data processing can be avoided, and more abstract, representative features can be
obtained by using the hidden layers of the sparse autoencoder to train the data layer by layer. The improved model
adopts the Softmax classifier, which can make the classification result closer to the real value. The improved deep
learning network model works in two stages: training and testing. The training stage builds an effective image
classification model, and the testing stage evaluates and analyzes the model according to the experimental
classification results. Figure 9 shows the workflow of the improved deep learning network model.

4.2. Image Classification Model Optimization. Existing research shows that the convolution neural network model can
be optimized through data enhancement and adjustment of the training method, and that the optimization depends on
the type of model used. For example, for a deep learning model with more layers, the training parameters can be
optimized.
According to the structure of the convolution neural network, the convolution layer is used to extract features, and
the size of the convolution kernel determines the quality of the extracted image features. From the perspective of
image composition, adjacent pixels generally form the edge lines of the image. Several edge lines form the image
texture, and textures combine into local patterns. The local pattern is the basic element of the image. Through the
convolution layers of the network model, different types of features can be extracted and the local patterns of the
image can be formed. When the convolution kernel is smaller,
6048, 2022, 1, Downloaded from https://onlinelibrary.wiley.com/doi/10.1155/2022/3351256, Wiley Online Library on [21/11/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Figure 8: Working diagram of improved image classification algorithm. (Flowchart blocks: image data set, data preprocessing, noise reduction, feature extraction, depth feature extraction, training parameter adjustment, feature classification, Softmax, output classification results.)

Figure 9: Working flow diagram of improved deep learning network model. (Flowchart blocks, training stage: training set, data preprocessing, weight initialization, hidden layer weight training, weight update by forward propagation, back propagation fine tuning, decision "Are constraints met?" with Y/N branches, network model. Testing stage: data preprocessing, get weight by forward propagation, testing and analysis, output classification results.)

although the convolution layer extracts more features, overfitting may occur in the model. Conversely, the larger the convolution kernel is, the fewer features the convolution layer extracts, and the worse the image classification effect may be. Therefore, reasonable optimization of the convolution kernel size can improve the accuracy of image classification.

Because the convolution neural network model extracts image features layer by layer through successive convolution layers, the number of convolution layers affects the feature extraction quality of the model to a certain extent. As with the number of convolution kernels contained in a convolution layer, the more convolution layers there are, the finer the features obtained by the model classifier, which may lead to overfitting; the fewer convolution layers there are, the coarser the features obtained by the model classifier, which may reduce image classification accuracy. Therefore, optimizing the number of convolution layers can improve the classification accuracy of the model.

In order to improve the accuracy of image classification and recognition, the deep learning model proposed in this paper needs to be optimized. Firstly, a smaller convolution kernel is selected in the first convolution layer in order to extract more image feature details. Secondly, the max pooling operation is adopted in the model to avoid the overfitting problem. As shown in Figure 10, the optimized image classification model consists of three convolution layers, in which the convolution kernel size decreases from one layer to the next. After each convolution layer, the features are processed by the ReLU activation function, and the generated features are used as the input of the max pooling layer. The model adopts three fully connected layers, takes the output of the last fully connected layer as the input of the Softmax classifier, and then generates the classification result of the image.

Figure 10: Structure diagram of optimized image classification model. (Layer sequence: image input, Conv1, max pool, Conv2, max pool, Conv3, max pool, FC1, FC2, FC3, Softmax function. Kernel-size labels shown in the figure: 9×9, 3×3, 5×5, 3×3, 3×3, 3×3.)
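The layer sequence of Figure 10 can be traced with a small shape calculator. The following is a minimal sketch rather than the authors' implementation. It assumes a 224×224 input, unpadded convolutions with stride 1, and 3×3 max pooling with stride 2; none of these values is stated in the paper. Assigning the 9×9, 5×5, and 3×3 kernel labels to Conv1, Conv2, and Conv3 is one reading of the figure, and the names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after sliding a kernel x kernel window over an n x n map."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride). All values are illustrative assumptions:
# the kernel-to-layer mapping is a reading of Figure 10, and the
# strides are not given in the paper.
LAYERS = [
    ("conv1", 9, 1),
    ("pool1", 3, 2),
    ("conv2", 5, 1),
    ("pool2", 3, 2),
    ("conv3", 3, 1),
    ("pool3", 3, 2),
]


def trace_shapes(input_size=224, layers=LAYERS):
    """Return [(layer name, output side length)] for the whole stack."""
    sizes = []
    n = input_size
    for name, kernel, stride in layers:
        n = out_size(n, kernel, stride)
        sizes.append((name, n))
    return sizes


if __name__ == "__main__":
    for name, n in trace_shapes():
        print(f"{name}: {n}x{n}")
```

Under these assumptions the feature map shrinks at every stage, so the first fully connected layer receives a much smaller spatial grid than the input image. Changing the assumed strides or adding padding changes every downstream size.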

The convolution layer parameters of the model include the size and number of convolution kernels. The first convolution layer is close to the image input layer and is mainly used to extract the basic features of the image, so its parameters have a great influence on the features. In order to facilitate the further processing of features in the subsequent convolution layers, a smaller convolution kernel needs to be used to extract attribute information such as the shadow, boundary, and light of the image.

The convolution layer maps the obtained features through the activation function. The optimized convolution neural network model adopts the ReLU activation function [20], which can be expressed as follows:

$$y(x) = \max(0, x). \tag{2}$$

When the traditional convolution neural network model uses the ReLU activation function to train features, it may lose useful feature information in the process of image classification. In order to prevent the loss of useful features during image classification, the existing ReLU activation function can be improved [20]. The optimized activation function can be expressed as

$$y(x_i) = \begin{cases} \dfrac{x_i}{c_i}, & x_i < 0, \\ x_i, & x_i \ge 0. \end{cases} \tag{3}$$

From the improved formula of the activation function, when the input feature is less than zero, the function not only retains the negative value information in the feature map but also strengthens the learning of effective features.

The optimized convolution neural model uses the Softmax function to classify the images, and the Softmax function uses a supervised learning algorithm to regress the features [21]. In the classification process, the category $y$ of the image target can take $M$ different values. If the image training set is $\{(x_1, y_1), \cdots, (x_N, y_N)\}$, where $x_i$ represents an image training sample, $y_i$ is its classification category, and $y_i \in \{1, 2, \cdots, M\}$, the cost function of the Softmax regression algorithm can be expressed as

$$r(\alpha) = \frac{1}{N}\left[\sum_{i=1}^{N}\sum_{k=1}^{M} 1\{y_i = k\}\,\log\left(\frac{\exp\left(\alpha_k^{T} x_i\right)}{\sum_{j=1}^{M}\exp\left(\alpha_j^{T} x_i\right)}\right)\right]. \tag{4}$$

By accumulating over the $M$ category labels in the cost function, the probability that training sample $x_i$ belongs to category $k$ can be obtained, which is expressed as

$$\lambda\left(y_i = k \mid x_i; \alpha\right) = \frac{\exp\left(\alpha_k^{T} x_i\right)}{\sum_{j=1}^{M}\exp\left(\alpha_j^{T} x_i\right)}. \tag{5}$$

5. Experiment and Analysis

5.1. Selection of Datasets and Experimental Methods. In order to verify the effectiveness of the image classification deep learning model proposed in this paper, the Flower dataset provided by Oxford University was used in the experiment [22]. The images cover variations in scale, shape, and lighting across different types of flowers, and the images within some flower categories vary greatly. The dataset contains 17 categories of flowers; each category contains 80 pictures, for a total of 1360 pictures. In the experiment, the dataset is randomly divided into three parts as the training set, verification set, and test set of the model. Sample images from the Flower dataset are shown in Figure 11.

The experiment is based on Matlab, Python, and a deep learning framework. Through the preprocessing, feature extraction, training, and classification of the Flower dataset, the effectiveness and accuracy of the model in image classification are verified.
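The random three-way split of the Flower dataset described above can be sketched as follows. This is a minimal illustration using only the Python standard library. The 70/15/15 ratio, the fixed seed, and the function name `split_indices` are assumptions, since the paper does not state how the 1360 images were apportioned.

```python
import random


def split_indices(n_items, train_pct=70, val_pct=15, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    The percentages are hypothetical; the remainder after the training and
    verification sets goes to the test set.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = n_items * train_pct // 100
    n_val = n_items * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # 17 categories x 80 images = 1360 images, as stated in the text.
    train, val, test = split_indices(17 * 80)
    print(len(train), len(val), len(test))
```

A purely random split like this does not guarantee that every category is equally represented in each part. A per-category (stratified) split would be an alternative, but the paper describes only a random division.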
Figure 11: Partial image of the Flower dataset [22]. (Sample categories shown: Fritillary, Dandelion, Lily Valley, Daisy, Daffodil, Cowslip, Tulip, Tigerlily, Crocus, Bluebell.)

In the experiment, classification accuracy is used to evaluate the robustness of the deep learning network model. Depending on the evaluation angle, classification accuracy generally includes overall accuracy and category classification accuracy [23]. The overall accuracy AC is the ratio of the number of correctly classified samples to the total number of test samples, while the category classification accuracy $CCA_i$ is the ratio of the number of correctly classified samples of category $i$ to the number of test samples of that category. The calculation formula is as follows:

$$AC = \frac{t_r}{t_n}, \qquad CCA_i = \frac{t_{ri}}{t_{ni}}. \tag{6}$$

In the above formula, $t_r$ represents the number of correctly classified samples, $t_n$ denotes the number of test samples, $t_{ri}$ indicates the number of correctly classified samples of type $i$, and $t_{ni}$ expresses the number of test samples of type $i$.

Figure 12: The relationship between the classification accuracy of the model and the number of iterations. (Recognition accuracy, 0 to 0.8, versus iterations, 50 to 500; curves: LeNet, AlexNet, VggNet.)

5.2. Results and Analysis. In order to facilitate comparative analysis, three common network models, LeNet, AlexNet, and VggNet, are selected for comparison in the experiment. The relationship between the classification accuracy and the number of iterations when using the different network models to classify flower images is shown in Figure 12. After 50 iterations, the accuracy of the LeNet model is 28%, that of the AlexNet model is 17%, and that of the VggNet model is 48%. After 100 iterations, the accuracy of the LeNet model is 45%, that of the AlexNet model is 39%, and that of the VggNet model is 72%. In addition, when the VggNet model converges, its highest accuracy is 75%, while the highest accuracy of the LeNet model is 58% and that of the AlexNet model is 41%.

In order to test the effect of image classification after the model is optimized, the accuracy of flower image classification before and after the model optimization is compared in the experiment, as shown in Figure 13. The comparison results show that, for the training dataset, the optimized model converges faster in the early stage of training and slower in the middle stage of training, while the two models are basically the same in the later stage of training. For the test dataset, the optimized model is better than the nonoptimized model in terms of both convergence speed and image classification accuracy. Therefore, the model optimization method proposed in this paper can effectively improve the accuracy of image classification.

In addition, in order to test the relationship between the loss value function of the optimized model and the number of iterations, the training set and test set are used to compare the models before and after optimization, as shown in Figure 14. The loss value function of the nonoptimized model shows an upward trend as the number of iterations increases, indicating that the nonoptimized model overfits, while the loss value function of the optimized model shows a downward trend as the number of iterations increases. It can be seen that the cost of parameter training can be reduced through model optimization.
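To make the evaluation concrete, the following sketch turns class scores into Softmax probabilities as in Equation (5), takes the most probable category as the prediction, and scores the predictions with the overall accuracy AC and the category classification accuracy CCA_i of Equation (6). It is an illustration under stated assumptions, not the authors' code. The scores stand in for the products α_k^T x_i, the sample data are invented, and all function names are hypothetical.

```python
import math
from collections import Counter


def softmax(scores):
    """Softmax probabilities for one sample's class scores, as in Eq. (5)."""
    shift = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Index of the most probable category."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def overall_accuracy(y_true, y_pred):
    """AC = t_r / t_n: correctly classified samples over all test samples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def category_accuracy(y_true, y_pred):
    """CCA_i = t_ri / t_ni for every category i that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


if __name__ == "__main__":
    # Hypothetical scores for four test samples over three categories.
    scores = [
        [2.0, 0.5, 0.1],
        [0.2, 1.5, 0.3],
        [0.1, 0.2, 3.0],
        [1.0, 2.0, 0.5],
    ]
    y_true = [0, 1, 2, 0]
    y_pred = [predict(s) for s in scores]
    print(overall_accuracy(y_true, y_pred))
    print(category_accuracy(y_true, y_pred))
```

Reporting CCA_i alongside AC shows whether a high overall figure hides a category that is classified poorly, which matters for a dataset where some flower categories vary more than others.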

Figure 13: Comparison of accuracy before and after model optimization. (Recognition accuracy, 0 to 1, versus iterations, 0 to 70; curves: Train_optimization, Test_optimization, Train_no_optimization, Test_no_optimization.)

Figure 14: Iterative comparison of model loss value before and after optimization. (Loss value, 0 to 4, versus iterations, 0 to 70; curves: Train_optimization, Test_optimization, Train_no_optimization, Test_no_optimization.)

From the above experimental comparison of the relationship between the accuracy of common network models in image classification and the number of iterations, it is known that the model proposed in this paper is superior to other models in classification accuracy. By comparing the classification accuracy of the deep learning model on the training set and the test set before and after optimization, it is known that the accuracy of image classification can be significantly improved after a certain degree of optimization.

6. Conclusion

Aiming at the problems of large time overhead and low classification accuracy in traditional image classification methods, a deep learning model of image classification based on machine learning was proposed in this paper. By analyzing the basic theory of neural networks, this paper expounded the types of convolution neural network and their application in image classification. Using the existing convolution neural network model, through noise reduction and parameter adjustment in the feature extraction process, an image classification deep learning model based on an improved convolution neural network was proposed. In order to improve the classification efficiency and accuracy of the model, this paper optimized the proposed deep learning model. Finally, the accuracy of several common network models in image classification was compared through experiments. The results showed that the proposed model was better than other models in classification accuracy. At the same time, the classification accuracy before and after the optimization of the deep learning model was compared and analyzed. The results showed that the optimized model had a great improvement in the accuracy of image classification. How to classify dynamic targets in complex environments is a problem that needs further research in the future.

Data Availability

The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This work was supported by the Shijiazhuang Posts and Telecommunications Technical College.

References

[1] S. H. Kim and H. L. Choi, “Convolutional neural network-based multi-target detection and recognition method for unmanned airborne surveillance systems,” International Journal of Aeronautical and Space Sciences, vol. 20, no. 4, pp. 1038–1046, 2019.
[2] P. W. Song, H. Y. Si, H. Zhou, R. Yuan, E. Q. Chen, and Z. D. Zhang, “Feature extraction and target recognition of moving image sequences,” IEEE Access, vol. 8, pp. 147148–147161, 2020.
[3] W. Y. Zhang, X. H. Fu, and W. Li, “The intelligent vehicle target recognition algorithm based on target infrared features combined with lidar,” Computer Communications, vol. 155, pp. 158–165, 2020.
[4] M. Li, H. P. Bi, Z. C. Liu et al., “Research on target recognition system for camouflage target based on dual modulation,” Spectroscopy and Spectral Analysis, vol. 37, no. 4, pp. 1174–1178, 2017.
[5] S. J. Wang, F. Jiang, B. Zhang, R. Ma, and Q. Hao, “Development of UAV-based target tracking and recognition systems,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 8, pp. 3409–3422, 2020.
[6] O. Kechagias-Stamatis and N. Aouf, “Evaluating 3D local descriptors for future LIDAR missiles with automatic target recognition capabilities,” The Imaging Science Journal, vol. 65, no. 7, pp. 428–437, 2017.
[7] M. Ding, Z. J. Sun, L. Wei, Y. F. Cao, and Y. H. Yao, “Infrared target detection and recognition method in airborne photoelectric system,” Journal of Aerospace Information Systems, vol. 16, no. 3, pp. 94–106, 2019.
[8] W. L. Xue and T. Jiang, “An adaptive algorithm for target recognition using Gaussian mixture models,” Measurement, vol. 124, pp. 233–240, 2018.
[9] F. Liu, T. S. Shen, S. J. Guo, and J. Zhang, “Multi-spectral ship target recognition based on feature level fusion,” Spectroscopy and Spectral Analysis, vol. 37, no. 6, pp. 1934–1940, 2017.

[10] S. Razakarivony and F. Jurie, “Vehicle detection in aerial imagery: a small target detection benchmark,” Journal of Visual Communication and Image Representation, vol. 34, pp. 187–203, 2016.
[11] O. K. Stamatis and N. Aouf, “A new passive 3-D automatic target recognition architecture for aerial platforms,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 1, pp. 406–415, 2019.
[12] L. Y. Ma, X. W. Liu, Y. Zhang, and S. L. Jia, “Visual target detection for energy consumption optimization of unmanned surface vehicle,” Energy Reports, vol. 8, pp. 363–369, 2022.
[13] Z. Geng, H. Deng, and B. Himed, “Ground moving target detection using beam-Doppler image feature recognition,” IEEE Transactions on Aerospace and Electronic Systems, vol. 54, no. 5, pp. 2329–2341, 2018.
[14] Z. M. Guo, Y. Jiang, and S. H. Bi, “Detection probability for moving ground target of normal distribution using an imaging satellite,” Chinese Journal of Electronics, vol. 27, no. 6, pp. 1309–1315, 2018.
[15] Y. K. Bai, “Target detection method of underwater moving image based on optical flow characteristics,” Journal of Coastal Research, vol. 93, no. sp1, p. 668, 2019.
[16] W. Z. Wu, J. W. Zou, J. Chen, S. Y. Xu, and Z. P. Chen, “False-target recognition against interrupted-sampling repeater jamming based on integration decomposition,” IEEE Transactions on Aerospace and Electronic Systems, vol. 57, no. 5, pp. 2979–2991, 2021.
[17] L. L. Yu, Q. X. Yang, and L. M. Dong, “Aircraft target detection using multimodal satellite-based data,” Signal Processing, vol. 155, pp. 358–367, 2019.
[18] I. Mahmud and Y. Z. Cho, “Detection avoidance and priority-aware target tracking for UAV group reconnaissance operations,” Journal of Intelligent and Robotic Systems, vol. 92, no. 2, pp. 381–392, 2018.
[19] T. Yulin, S. H. Jin, G. Bian, and Y. H. Zhang, “Shipwreck target recognition in side-scan sonar images by improved YOLOv3 model based on transfer learning,” IEEE Access, vol. 8, pp. 173450–173460, 2020.
[20] Z. M. Guo, Y. Jiang, and S. H. Bi, “Detection probability for moving ground target of normal distribution using infrared satellite,” Optik, vol. 181, pp. 63–70, 2019.
[21] S. Matteoli, M. Diani, and G. Corsini, “Automatic target recognition within anomalous regions of interest in hyperspectral images,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1056–1069, 2018.
[22] M. E. Nilsback and A. Zisserman, “Automated flower classification over a large number of classes,” in Proceedings of the Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pp. 722–729, Bhubaneswar, India, December 2008.
[23] C. Y. Zhang, B. L. Guo, N. N. Liao et al., “A public dataset for space common target detection,” KSII Transactions on Internet and Information Systems, vol. 16, no. 2, pp. 365–380, 2022.
