Paper 3
Article info

Article history: Received 25 November 2019; Revised 6 January 2020; Accepted 23 January 2020; Available online 1 February 2020.

Keywords: Butterfly; Deep learning; Classification; ResNet; Transfer learning

Abstract

In today's competitive conditions, producing fast, inexpensive and reliable solutions is an objective for engineers. The development of artificial intelligence and its introduction into almost all areas have created a need to minimize the human factor by using artificial intelligence in the field of image processing, as well as to save time and labor. In this paper, we propose an automated butterfly species identification model using deep neural networks. We collected 44,659 images of 104 different butterfly species taken in the field in Turkey, covering different butterfly positions, shooting angles, butterfly distances, occlusion and background complexity. Since many species have only a few image samples, we constructed a field-based dataset of 17,769 butterfly images belonging to 10 species. Convolutional Neural Networks (CNNs) were used for the identification of butterfly species. Comparison and evaluation of the experimental results obtained using three different network structures are conducted. Experimental results on 10 common butterfly species showed that our method successfully identified various butterfly species.

https://doi.org/10.1016/j.jestch.2020.01.006
2215-0986/© 2020 Karabuk University. Publishing services by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
linear classifier (MLSLC), nearest mean classifier (NMC) and decision tree (DT). Their experiments on images collected from actual field trapping yielded a classification rate of 86.6%.

Xie et al. [6] proposed a learning model for the classification of insect images using advanced multiple-task sparse representation and multiple-kernel learning techniques. They tested the proposed model on 24 common pest species of field crops and compared it with some recent methods. Feng et al. [7] developed an insect species identification and retrieval system based on wing attributes of moth images. Their retrieval system follows a content-based image retrieval (CBIR) architecture, which does not produce a final answer but provides a match list to the user [8]; the final decision can be made by an expert.

Yao et al. [9] designed a model to automate rice pest identification using 156 features of pests. The authors tested the proposed model on a few middle-sized species of Lepidoptera rice pests using the support vector machine classifier. Another study on pest recognition was carried out by Faithpraise et al. using the k-means clustering algorithm and correspondence filters [10]. Leow et al. [11] developed an automated identification system for copepod specimens by extracting morphological features and using an artificial neural network (ANN). They used seven copepod features from 240 sample images and reported an overall accuracy of 93.13%. Zhu and Zhang [12] proposed a method to classify lepidopteran insect images by integrated region matching and the dual-tree complex wavelet transform. They tested their method on a database consisting of 100 lepidopteran insects from 18 families and reported a recognition accuracy of 84.47%.

Mayo and Watson [13] showed that data mining techniques could be effective for the recognition of species. They used WEKA, a data mining tool offering different classifiers such as Naïve Bayes, instance-based learning, random forests, decision trees and support vector machines. WEKA was able to achieve an accuracy of 85% using support vector machines to classify live moths by species. Another work on the identification of live moths was conducted by Watson et al. [14], who proposed an automated identification tool named DAISY.

Silva et al. [15] aimed to investigate the best combination of feature selection techniques and classifiers to identify honeybee subspecies. Among seven combinations of feature selectors and classifiers, the best pair in their experiments was the Naïve Bayes classifier combined with the correlation-based feature selector.

Wang et al. [16] designed an identification system for insect images at the order level. They extracted seven features from the insect images; however, the authors manually removed some objects attached to the insects, such as pins. Their method was tested on 225 specimen images from nine orders and sub-orders using an artificial neural network, with an accuracy of 93%.

Abeysinghe et al. [17] introduced a fully automated framework to identify snake species using convolutional Siamese networks. Although the current dataset of snakes is small, they achieved satisfactory results.
In this paper, we create a dataset of butterfly pictures taken from nature. After applying some preprocessing to the butterfly images, we used deep convolutional neural networks based on the VGG (VGG16 and VGG19) and ResNet (ResNet-50) architectures.

The article is organized as follows: the field butterfly image dataset is discussed and a technical approach based on the different convolutional neural network architectures is introduced in Section 2. Section 3 presents the experimental results and their discussion and, finally, Section 4 concludes the paper and outlines further work.

2. Materials and methods

2.1. Dataset

In this study, we gathered 44,659 images in 104 different classes from the website of the Butterflies Monitoring & Photography Society of Turkey [18]. We developed a C# application to download all butterfly images; each image was saved to a directory named after its genus. When the images were downloaded from the website, we labeled the species using the identifications approved by professionals. The dataset was divided into 104 different classes using these labels. Since many classes have only a few image samples, we restricted the dataset to the 10 classes with the largest sample sizes, as shown in Table 1. The new dataset has 17,769 images. The number of classes was reduced to 10 in order to increase the classification accuracy and to remove the classes with fewer samples. Since all images on the website are of different resolutions, they were resized to 224 x 224 pixels. Sample images for each class are given in Fig. 1. Since the input images were captured with wide field-of-view cameras, they have some characteristics such as occlusion and background complexity. We deleted images containing more than one butterfly, even if the butterflies were of the same species. We also deleted images with butterfly larvae and caterpillars.

Table 1
Name and the number of images per butterfly genus in the dataset.

Genus          Images
Polyommatus    5559
Lycaena        2472
Melitaea       2235
Pieris         1530
Aricia         1080
Pyrgus         1036
Plebejus       1003
Nymphalis      989
Melanargia     943
Coenonympha    922
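The preprocessing described in Section 2.1 amounts to a simple directory-per-genus pipeline. The following Python sketch illustrates one way to reproduce the class-filtering and resizing steps (the original download tool was written in C#); the directory names, the use of the Pillow library and the filtering logic are illustrative assumptions rather than the authors' code:

# Minimal dataset-preparation sketch: keep the 10 largest genera (Table 1)
# and resize every image to 224 x 224 pixels. Paths are hypothetical.
import os
from PIL import Image

SOURCE_DIR = "butterflies_raw"      # one sub-folder per genus (assumed layout)
TARGET_DIR = "butterflies_224"      # resized copies are written here
IMG_SIZE = (224, 224)               # input size expected by VGG16/VGG19/ResNet50
KEPT_GENERA = {"Polyommatus", "Lycaena", "Melitaea", "Pieris", "Aricia",
               "Pyrgus", "Plebejus", "Nymphalis", "Melanargia", "Coenonympha"}

for genus in sorted(os.listdir(SOURCE_DIR)):
    if genus not in KEPT_GENERA:
        continue                    # drop the rare classes
    os.makedirs(os.path.join(TARGET_DIR, genus), exist_ok=True)
    for name in os.listdir(os.path.join(SOURCE_DIR, genus)):
        image = Image.open(os.path.join(SOURCE_DIR, genus, name)).convert("RGB")
        image = image.resize(IMG_SIZE)   # uniform 224 x 224 resolution
        image.save(os.path.join(TARGET_DIR, genus, name))

The removal of images showing more than one butterfly, larvae or caterpillars, described above, is not covered by this sketch.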
2.2. Deep neural networks

Deep learning has, especially in recent years, become a key research area for applications based on artificial intelligence [19]. Due to its notable achievements in computer vision [20], natural language processing [21] and speech recognition [22], its rate of use increases day by day.

Deep learning algorithms, based on artificial neural networks inspired by a simplification of the neurons in the human brain, come to the forefront with their success in the learning phase. Deep learning algorithms can solve the problem of feature extraction and selection by automatically extracting the distinguishing features from the given input data. Much more labeled data is needed for deep learning compared to classic neural networks. The rapid increase of available data today has made the role of deep learning very important in problem-solving. These developments have attracted the attention of many researchers in the field of computer science.

Convolutional Neural Networks (CNNs), regarded as the fundamental architecture of deep learning, are feed-forward Multi-Layer Perceptron (MLP) networks inspired by the vision of animals [23]. CNNs are deep learning models which are mainly used for image classification, similarity detection and object recognition [24,25]. CNNs have the ability to identify faces [26], persons [27], traffic signs [28], etc.

CNNs, which focus mostly on image classification, are now used in almost every area requiring classification. The general CNN structure consists of several consecutive convolution and pooling layers, one or more fully connected layers and, at the end, the output layer (the softmax layer) for classification. A sample CNN structure is given in Fig. 2.

In the input layer, images are taken into the model as input data. The structure of this layer is quite important in terms of the success and resource cost of the designed model. If necessary, some preprocessing algorithms such as scaling and noise reduction are applied to the input data. If the input consists of low-resolution images, it can cause a decrease in the depth and performance of the network. In the convolution layer, known as a transformation layer, the convolution process is applied to the input data through filters as a form of feature selection. Filters can be preset as well as randomly generated. The result of the convolution process is a feature map of the data. The filter size is typically set to 1 x 1, 3 x 3, 5 x 5 and sometimes 7 x 7. Fig. 3 shows the application of the Sobel filter to sample image data. This process continues over the whole image for all filters. Finally, features are obtained at the end of this process. If the input images have 3 channels (RGB), the convolution process is applied to each channel.

The padding operation determines how pixel information is added to the borders of the input matrix. The stride determines how many steps the window is shifted at each step. After the convolution process, the dimension of the matrix obtained by applying an f x f sized filter to an image of size n x n is calculated by the following equation:

n_x, n_y = (n + 2p - f) / s + 1        (1)

where p is the padding size and s is the stride.
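To make Eq. (1) concrete, a small helper function (an illustration added here, not part of the original study) can compute the output size of a convolution for given filter, padding and stride values:

# Output size of a convolution according to Eq. (1): n_out = (n + 2p - f) / s + 1
def conv_output_size(n, f, p=0, s=1):
    return (n + 2 * p - f) // s + 1

print(conv_output_size(224, f=3, p=1, s=1))   # 224: a 3x3 filter with padding 1 preserves the size
print(conv_output_size(224, f=7, p=0, s=2))   # 109: a 7x7 filter with stride 2 shrinks the map

For example, a 3 x 3 filter with padding 1 and stride 1 keeps a 224 x 224 input at 224 x 224, since (224 + 2 - 3)/1 + 1 = 224.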
The pooling layer is used to reduce the size of the input matrix for the next convolution layer. In pooling, w x h sized windows are moved over the input with a certain step, and a new image is created by taking the maximum value (max pooling) or the average value (average pooling) of each window. While the number of channels is kept constant, the width and height can be reduced. Max pooling and average pooling are the two common approaches in the pooling layer. An example of max pooling and average pooling is shown in Fig. 4 for an input matrix of size 4 x 4 and a pooling filter of size 2 x 2.

The fully connected layer comes after the successive convolution and pooling layers. In the fully connected layer, the data from the pooling layer is reduced to a single dimension. Since each neuron is connected to all neurons of the previous layer, it is called fully connected. In this layer, the classification process is carried out and activation functions such as ReLU and Sigmoid are used. In the output layer, the Softmax function produces a probability-based loss value from the score values generated by the model. Fig. 5 shows the output layer for a two-class problem.

The dropout layer, used in cases where the model over-fits (memorizes), eliminates some of the connections in the model that cause over-learning. The network is thus prevented from memorizing by deleting some neurons in the fully connected layers at a random rate (usually 0.5).

The classic CNN architecture consists of two basic parts. The first part is the convolution and pooling layers; features are extracted from the input images in this part [29]. In the first part, the feature extraction stages are applied in a general sense, not problem-specific. In the second part, the fully connected layer and the output layer perform problem-specific classification.
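The layer types described above can be combined into a complete network. The short Keras sketch below only illustrates the generic structure of convolution, pooling, fully connected and softmax layers (including dropout); it is not the architecture used in the experiments, which rely on the pre-trained models described in Sections 2.3.1 and 2.3.2:

# Illustrative CNN with the layer types discussed above (not the network used in the study).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D((2, 2)),               # max pooling halves width and height
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.AveragePooling2D((2, 2)),           # average pooling is the other common choice
    layers.Flatten(),                          # reduce the feature maps to a single dimension
    layers.Dense(128, activation="relu"),      # fully connected layer
    layers.Dropout(0.5),                       # randomly drop neurons (rate 0.5) against over-fitting
    layers.Dense(10, activation="softmax"),    # output layer: class probabilities for 10 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])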
2.3.1. VGGNet

Simonyan and Zisserman developed a new convolutional neural network architecture, the so-called VGG, in [31]. The architecture of the VGG network is shown in Fig. 6. They released the two best-performing VGG models, with 16 and 19 weight layers; 16 and 19 stand for the number of weight layers in the VGG networks. VGG16, with 13 convolutional and 3 fully connected layers, achieved an 8.8% top-5 test error on the ImageNet dataset. VGG19 has 16 convolutional layers and reached a 9.0% top-5 test error rate. In the VGG network, the input is a fixed-size 224 x 224 RGB image.
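Since the study uses fine-tuned transfer learning (Section 3), a VGG16 model pre-trained on ImageNet can be loaded directly from the Keras applications module and given a new 10-class output layer. The sketch below is one plausible setup; which layers were frozen, the optimizer and the classifier head used in the actual experiments are not reported here, so those choices are assumptions:

# Transfer-learning sketch with pre-trained VGG16 (assumed configuration, not the authors' exact setup).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # freeze the convolutional feature extractor

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),  # one output per butterfly genus in Table 1
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

VGG19 and ResNet50 can be substituted by importing them from the same tensorflow.keras.applications module.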
2.3.2. ResNet

ResNet has a different structure than traditional consecutive network architectures such as VGGNet and AlexNet, because it is built from micro-architectural modules that differ from those of other architectures. It may be preferable to pass information to a deeper layer while skipping the transformation of some intermediate layers. This is allowed in the ResNet architecture, and the success rate of the network is increased by eliminating the problem of the network memorizing.
The ResNet architecture has a network of 177 layers. In addition to this layered structure, the model also defines how the inter-layer connections occur. The model was trained on images of size 224 x 224 x 3. Fig. 7 shows an example of the connection used in the ResNet architecture.
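The inter-layer connection mentioned above is a shortcut (skip) connection in which the input of a block is added to the block's output. The following Keras functional-API fragment is a simplified illustration of such a residual connection; it is not the exact bottleneck block of ResNet-50, and the tensor sizes are chosen arbitrarily:

# Simplified residual (skip) connection in the Keras functional API (illustrative only).
from tensorflow.keras import Input, Model, layers

inputs = Input(shape=(56, 56, 64))
x = layers.Conv2D(64, (3, 3), padding="same", activation="relu")(inputs)
x = layers.Conv2D(64, (3, 3), padding="same")(x)
x = layers.Add()([inputs, x])          # shortcut: the block input is added to its output
outputs = layers.Activation("relu")(x)
block = Model(inputs, outputs)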
3. Experimental results

Three transfer learning methods were used (VGG16, VGG19 and ResNet50). These widely used models have been trained on the ImageNet dataset and have achieved high success. In this study, fine-tuned transfer learning methods are used for the classification of the butterfly images.

900 sample images were used for each class given in Table 1. Each class was divided into two parts automatically: a training part (80%) and a testing part (20%). Since the training and testing data could change in each run, the training was performed five times for each dataset and model; the average accuracy values over 100 epochs and the average loss values are given in Figs. 8-10. The models were implemented using the Python programming language and the Keras library. The deep learning experiments were run on the GPU because they require a large amount of memory. The hardware specifications of the computer on which the CNN models were trained and tested are:

Table 2
Model parameters.

Table 3
Comparison of CNN models.

In the accuracy graphs, the accuracy reaches its highest achievement (80%) and does not increase afterward. A similar characteristic pattern can be seen in the loss graph. Since no preprocessing has been performed on the input images, the success is satisfactory considering the problems of the position of butterflies, the shooting angle, butterfly distance, occlusion and background complexity. Fig. 10 shows the accuracy and model loss graphs of the ResNet architecture. It is seen that the success of up to 85% achieved during the training phase, where the ResNet architecture had the problem of overfitting (memorization) on this input dataset, could not be achieved during the testing phase.
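A single training run of the kind described above (automatic 80/20 split, 100 epochs, Keras) can be sketched as follows. The directory layout, batch size, optimizer and the use of ImageDataGenerator for the split are assumptions, and the backbone is simply frozen here; the paper does not report these implementation details:

# Sketch of one training run: 80/20 split, 100 epochs, VGG16-based model in Keras (assumed details).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False
model = models.Sequential([base, layers.Flatten(),
                           layers.Dense(256, activation="relu"),
                           layers.Dropout(0.5),
                           layers.Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)   # 80% train / 20% test
train = datagen.flow_from_directory("butterflies_224", target_size=(224, 224),
                                    batch_size=32, class_mode="categorical", subset="training")
test = datagen.flow_from_directory("butterflies_224", target_size=(224, 224),
                                   batch_size=32, class_mode="categorical", subset="validation")

history = model.fit(train, validation_data=test, epochs=100)
print(max(history.history["val_accuracy"]))   # best test accuracy observed in this run

Repeating such a run five times and averaging the accuracy and loss curves reproduces the evaluation protocol described above.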
4. Conclusions

In this paper, a field-based dataset was created using butterfly images taken from nature and classified by expert entomologists. The input images of butterflies are classified with deep learning architectures without using any feature extraction method. Transfer learning was carried out using pre-trained models. Comparison and evaluation of the experimental results obtained using three different network structures were conducted. According to the results, the highest success was achieved by the VGG16 architecture. Although the images have some problems such as the position of butterflies, the shooting angle, butterfly distance, occlusion and background complexity, approximately 80% success was achieved for both test and training data. In conclusion, we observed that the transfer learning approach can be successfully applied to nature images. Our future research includes developing a mobile application by using the pre-trained model proposed in this study.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors are grateful to the anonymous referees for their constructive comments and valuable suggestions, which have helped us very much to improve the quality of the paper.

References

[1] N.E. Stork, How many species of insects and other terrestrial arthropods are there on Earth?, Annu. Rev. Entomol. 63 (2018) 31-45, https://doi.org/10.1146/annurev-ento-020117-043348.
[2] M. Pinzari, M. Santonico, G. Pennazza, E. Martinelli, R. Capuano, R. Paolesse, et al., Chemically mediated species recognition in two sympatric Grayling butterflies: Hipparchia fagi and Hipparchia hermione (Lepidoptera: Nymphalidae, Satyrinae), PLoS One 13 (6) (2018), https://doi.org/10.1371/journal.pone.0199997.
[3] Y. Kaya, O.F. Ertuğrul, R. Tekin, Two novel local binary pattern descriptors for texture analysis, Appl. Soft Comput. 34 (2015) 728-735, https://doi.org/10.1016/j.asoc.2015.06.009.
[4] Y. Kaya, L. Kaycı, Application of artificial neural network for automatic detection of butterfly species using color and texture features, Visual Comput. 30 (2014) 71-79, https://doi.org/10.1007/s00371-013-0782-8.
[5] C. Wen, D. Guyer, Image-based orchard insect automated identification and classification method, Comput. Electron. Agric. 89 (2012) 110-115, https://doi.org/10.1016/j.compag.2012.08.008.
[6] C. Xie, J. Zhang, R. Li, J. Li, P. Hong, J. Xia, P. Chen, Automatic classification for field crop insects via multiple-task sparse representation and multiple-kernel learning, Comput. Electron. Agric. 119 (2015) 123-132, https://doi.org/10.1016/j.compag.2015.10.015.
[7] L. Feng, B. Bhanu, J. Heraty, A software system for automated identification and retrieval of moth images based on wing attributes, Pattern Recogn. 51 (2016) 225-241, https://doi.org/10.1016/j.patcog.2015.09.012.
[8] M. Martineau, D. Conte, R. Raveaux, I. Arnault, D. Munier, G. Venturini, A survey on image-based insect classification, Pattern Recogn. 65 (2017) 273-284, https://doi.org/10.1016/j.patcog.2016.12.020.
[9] Q. Yao, J. Lv, Q.-J. Liu, G.-Q. Diao, B.-J. Yang, H.-M. Chen, J. Tang, An insect imaging system to automate rice light-trap pest identification, J. Integr. Agric. 11 (2012) 978-985, https://doi.org/10.1016/S2095-3119(12)60089-6.
[10] F. Faithpraise, P. Birch, R. Young, J. Obu, B. Faithpraise, C. Chatwin, Automatic plant pest detection and recognition using k-means clustering algorithm and correspondence filters, Int. J. Adv. Biotechnol. Res. 4 (2013) 189-199, http://sro.sussex.ac.uk/id/eprint/49042.
[11] L.K. Leow, L.-L. Chew, V.C. Chong, S.K. Dhillon, Automated identification of copepods using digital image processing and artificial neural network, BMC Bioinf. 16 (2015) S4, https://doi.org/10.1186/1471-2105-16-S18-S4.
[12] L.-Q. Zhu, Z. Zhang, Insect recognition based on integrated region matching and dual tree complex wavelet transform, J. Zhejiang Univ. Sci. C 12 (2011) 44-53, https://doi.org/10.1631/jzus.C0910740.
[13] M. Mayo, A.T. Watson, Automatic species identification of live moths, Knowl.-Based Syst. 20 (2007) 195-202, https://doi.org/10.1016/j.knosys.2006.11.012.
[14] A.T. Watson, M.A. O'Neill, I.J. Kitching, Automated identification of live moths (Macrolepidoptera) using digital automated identification system (DAISY), Syst. Biodivers. 1 (2004) 287-300, https://doi.org/10.1017/S1477200003001208.
[15] F.L.d. Silva, M.L. Grassi Sella, T.M. Francoy, A.H.R. Costa, Evaluating classification and feature selection techniques for honeybee subspecies identification using wing images, Comput. Electron. Agric. 114 (2015) 68-77, https://doi.org/10.1016/j.compag.2015.03.012.
[16] J. Wang, C. Lin, L. Ji, A. Liang, A new automatic identification system of insect images at the order level, Knowl.-Based Syst. 33 (2012) 102-110, https://doi.org/10.1016/j.knosys.2012.03.014.
[17] C. Abeysinghe, A. Welivita, I. Perera, Snake image classification using Siamese networks, in: Proceedings of the 2019 3rd International Conference on Graphics and Signal Processing, 2019, pp. 8-12, https://doi.org/10.1145/3338472.3338476.
[18] Butterflies Monitoring & Photography Society of Turkey, http://adamerkelebek.org, 2019.
[19] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436-444, https://doi.org/10.1038/nature14539.
[20] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2818-2826, https://doi.org/10.1109/CVPR.2016.308.
[21] R. Collobert, J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, in: ICML '08: Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 160-167, https://doi.org/10.1145/1390156.1390177.
[22] G. Hinton et al., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups, IEEE Signal Process. Mag. 29 (2012) 82-97, https://doi.org/10.1109/MSP.2012.2205597.
[23] K. Fukushima, Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biol. Cybern. 36 (1980) 193-202, https://doi.org/10.1007/BF00344251.
[24] Y. LeCun, Y. Bengio, Convolutional networks for images, speech, and time series, in: The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge, 1995, pp. 255-258.
[25] J. Wang, Y. Yang, J. Mao, Z. Huang, C. Huang, W. Xu, CNN-RNN: a unified framework for multi-label image classification, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2285-2294, https://doi.org/10.1109/CVPR.2016.251.
[26] F. Schroff, D. Kalenichenko, J. Philbin, FaceNet: a unified embedding for face recognition and clustering, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 815-823, https://doi.org/10.1109/CVPR.2015.7298682.
[27] D. Cheng, Y. Gong, S. Zhou, J. Wang, N. Zheng, Person re-identification by multi-channel parts-based CNN with improved triplet loss function, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1335-1344, https://doi.org/10.1109/CVPR.2016.149.
[28] P. Sermanet, Y. LeCun, Traffic sign recognition with multi-scale convolutional networks, in: The 2011 International Joint Conference on Neural Networks, 2011, pp. 2809-2813, https://doi.org/10.1109/IJCNN.2011.6033589.
[29] M. Liang, X. Hu, Recurrent convolutional neural network for object recognition, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3367-3375, https://doi.org/10.1109/CVPR.2015.7298958.
[30] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, Commun. ACM 60 (2017) 84-90, https://doi.org/10.1145/3065386.
[31] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: International Conference on Learning Representations, 2015.
[32] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, MobileNets: efficient convolutional neural networks for mobile vision applications, arXiv:1704.04861, 2017.
[33] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778, https://doi.org/10.1109/CVPR.2016.90.
[34] J. Deng, W. Dong, R. Socher, L. Li, K. Li, L. Fei-Fei, ImageNet: a large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248-255, https://doi.org/10.1109/CVPR.2009.5206848.
[35] A. Canziani, A. Paszke, E. Culurciello, An analysis of deep neural network models for practical applications, arXiv:1605.07678, 2016.