Flower Classification With Deep CNN and Machine Learning Algorithms
Büşra Rümeysa Mete
Computer Engineering Department
Istanbul Kultur University
Istanbul, Turkey
[email protected]

Tolga Ensari
Computer Engineering Department
Istanbul University - Cerrahpasa
Istanbul, Turkey
[email protected]
Abstract—Recognizing rare plant species is valuable in fields such as the pharmaceutical industry, botany, agriculture, and trade. The task is challenging because flower species are highly diverse and many of them closely resemble one another, which makes the subject an important one. In this context, this paper presents a classification system for flower images using a Deep CNN and data augmentation. Deep CNN techniques have recently become the leading technology for such problems; however, progress in flower classification is held back by the lack of labeled data. The study makes three primary contributions. First, we propose a classification model that improves the performance of flower image classification by using a Deep CNN to extract features and various machine learning algorithms to classify them. Second, we demonstrate the use of image augmentation to achieve better performance. Last, we compare the performance of machine learning classifiers such as SVM, Random Forest, KNN, and Multi-Layer Perceptron (MLP). We evaluated our classification system on two datasets, Oxford-17 Flowers and Oxford-102 Flowers, dividing each dataset into training and test sets at a ratio of 0.8 to 0.2. We obtained the best accuracy for the Oxford-102 Flowers dataset, 98.5%, with the SVM classifier, and the best accuracy for the Oxford-17 Flowers dataset, 99.8%, with the MLP classifier. These results are better than those reported in the literature for the same datasets.

Index Terms—machine learning; cnn; feature extraction; data augmentation; flower classification

Authorized licensed use limited to: Rajamangala Univ of Technology Isan provided by UniNet. Downloaded on June 09,2023 at 06:11:44 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Flowers are among the most important producers on earth and can grow in a wide variety of climates and habitats. They play a very important role in the food chain by feeding almost all insect species in the world. In addition, many drugs can be produced from their healing properties. For these reasons, a good knowledge of flowers and their species is very important for recognizing new or rare plant species. Otherwise, many plants may be destroyed because they are considered harmful to farmland, or may be sold at very low prices, all because the plant species were not adequately recognized. Yet many of the plants that grow in nature can be cultivated. In addition, improving the recognition of the numerous endemic plant species, such as elecampane and Verbascum thapsus, which are confined to a specific area and grow only under special climatic conditions, will support the development of the pharmaceutical industry. Thanks to studies in this field, little-known plants will be able to receive the value they deserve.

II. RELATED WORK

To classify flowers, researchers used to pay most attention to segmenting images and selecting features in ways that can be described as manual. This traditional approach looks primitive today, given the continuous progress of computer science and the technology that comes with it, because it requires human intervention and is therefore not fully automatic. Moreover, it cannot achieve sufficiently high accuracy.

In [1], the authors from Oxford University introduced a visual vocabulary that covers many different features, such as colour, shape, and texture, which discriminate between flowers from different classes. They also presented a study [2] that combines four different flower features and classifies them using a multiple kernel framework with a Support Vector Machine (SVM) classifier. With this approach they reached 88.33% accuracy on the quite challenging Oxford-102 Flowers dataset, which is also one of the datasets used in our study.

In [3], the authors applied fine-grained classification to the same datasets used in our study, Oxford-17 Flowers and Oxford-102 Flowers, and obtained accuracies of 93.14% and 79.1%, respectively.

In 2017, the study [4] used Google's pre-trained GoogLeNet model with the inception-v3 module to classify flowers. The authors selected the same two datasets as the studies mentioned above and our own study: Oxford-17 Flowers and Oxford-102 Flowers. Although neither dataset is very large, they reached quite high accuracies of 95% and 94%, respectively.

To allow a fair comparison, we used the multi-class datasets prepared by the Oxford Visual Geometry Group, which are also used in other studies and are among the most challenging datasets in this field. The originality of the study is that we use GoogLeNet's inception-v3 module to extract features. Moreover, because of the lack of labeled data, we also applied a data augmentation technique before the feature extraction phase. After dividing our data into training and test sets, we classified the features extracted with inception-v3 using various machine learning algorithms.
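The 0.8/0.2 division into training and test sets can be illustrated with a minimal sketch. It assumes a per-class (stratified) split, which the paper does not specify, and uses toy labels that mirror the 17 classes of 80 images described in Section III. The function name `stratified_split` and the fixed seed are hypothetical choices made for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test sets, class by class.

    Assumption: the split is stratified so that every class keeps
    the same train/test ratio. The paper only states the 0.8/0.2 ratio.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels only: 17 classes with 80 samples each (no real images).
labels = [c for c in range(17) for _ in range(80)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 1088 272
```

With 80 samples per class, each class contributes 64 training and 16 test samples. A stratified split is one reasonable choice when classes are small, since it keeps every class represented in the test set.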
III. DATASET
In this study, we use the 17-Flowers and 102-Flowers datasets from the Oxford Visual Geometry Group. Both datasets are composed of flowers that are common in England. The 17-Flowers dataset contains 80 images for each flower class, while the 102-Flowers dataset contains at least 40 and up to 200 images for each class. The images have large scale, exposure, and light variations. In addition, some classes differ greatly from one another, while many others closely resemble other classes. These well-known datasets are used for the fine-grained recognition explained in Section I. Sample images of 20 flower species from the Oxford-102 Flowers dataset can be seen in Fig. 1.

Fig. 2: Proposed classification system
wider model was designed. They obtained the concept of width within the model through the module called inception.

There are four modules for the GoogLeNet model in the literature: inception-v1 [6], inception-v2 [7], inception-v3 [7], and inception-v4 [8]. Inception-v3 consists of two parts, a feature extraction part and a classification part. The feature extraction part is a convolutional neural network, while the classification part consists of fully connected and softmax layers. All of these layers are shown in Fig. 4¹.

In this study, we used an earlier GoogLeNet module, inception-v3, as a feature extractor by removing its classification part. Fig. 4 shows the feature extraction part and the removed classification part of the module. We fed the images of the datasets described in Section III into the model, with its classification layer removed, in order to produce a set of labeled feature vectors.

Inception-v3 is a well-known architecture. The network's input must be an image of 299x299 pixels.

Fig. 4: Feature extraction and classification parts of Inception-v3.

¹ The figure was taken from https://codelabs.developers.google.com on 06.09.2019.

C. Dimension Reduction

During the experiments, we found it very difficult to understand or discover the relationships between features in a multidimensional dataset. We therefore applied a dimension reduction step and show its outputs. Note that this step is purely informative.

While Principal Component Analysis (PCA) is a linear feature extraction technique, t-SNE is a non-linear dimensionality reduction technique that is better suited to visualizing high-dimensional datasets [9].

In this study, after the CNN features were obtained and saved, the t-distributed stochastic neighbor embedding (t-SNE) technique was applied for purely informative purposes. Fig. 5 shows the high-dimensional features transformed into a two-dimensional feature plane for the Oxford-17 Flowers and Oxford-102 Flowers datasets, respectively.

Fig. 5 also shows that points of the same color are frequently clustered together. This suggested that the features would allow our classifiers to be trained with high accuracy.

D. Support Vector Machine

Support Vector Machine (SVM) is one of the most effective and simplest supervised learning algorithms. Although it can also be applied to regression, it is most often used in classification problems. SVMs require no prior knowledge or assumption about the distribution of the input data, and the input data can be separated linearly or non-linearly. In addition, SVMs are comparatively resistant to over-fitting, whereas in artificial neural networks over-fitting may occur unless cross validation is applied. Moreover, various kernel functions can be used to make inseparable problems separable by mapping the data into a more suitable space. Kernel-based algorithms are quite flexible for two reasons: the algorithm is independent of hyperparameters such as learning rate and fixing parameters, and it is sufficient to change the kernel function when the problem domain changes [10].

The most commonly used kernel functions are the polynomial kernel (1), the linear kernel (2), and the Gaussian/RBF kernel (3):

$$K(x, y) = (x \cdot y + 1)^d \tag{1}$$

$$K(x, y) = x \cdot y \tag{2}$$

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right) \tag{3}$$

In this study, we obtained the highest success with the SVM classifier. After experimenting with different kernel types (polynomial, linear, and Gaussian/RBF), we observed the highest accuracy, 98.5% in 114.99 seconds, on the Oxford-102 Flowers dataset. We obtained these results using the LinearSVC class of sklearn with the default values of its parameters.

E. Random Forest

Random Forest is one of the popular machine learning models because it can be applied to both regression and classification problems. Furthermore, it gives good results without hyperparameter tuning. Random Forest aims to improve classification performance by using more than one decision tree during classification. One of the biggest problems of decision trees is over-learning, which can also be called memorizing the data or, more technically, over-fitting. To address this problem, the random forest model trains tens or even hundreds of decision trees, each on a different subset drawn randomly from both the dataset and the feature set. Each decision tree makes its own prediction, and the forest combines these predictions.

An accuracy of 86.4% was obtained in 140.586323 seconds on the Oxford-102 Flowers dataset.

F. KNN

In KNN, many different formulas can be used to compute distance, such as the Euclidean, Manhattan, and Minkowski distances. Their equations are given in (4), (5), and (6), respectively.

$$\sqrt{\sum_{i=1}^{k} (x_i - y_i)^2} \tag{4}$$
Fig. 5: Transformation of the datasets into a two-dimensional feature plane without data augmentation
$$\sum_{i=1}^{k} |x_i - y_i| \tag{5}$$

$$\left(\sum_{i=1}^{k} \left(|x_i - y_i|\right)^q\right)^{1/q} \tag{6}$$

from the KNN classifier using Manhattan distance, at 99.1% in 1.53 seconds. Finally, the lowest accuracy was obtained from the Random Forest classifier, at 95.4% in 5.03 seconds. The findings are shown in Fig. 6.
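The distance formulas (4)-(6) given for KNN can be written directly in code. The sketch below is illustrative only: the helper names, the value of k, and the toy points are assumptions, and it is not the implementation used in the experiments.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance, formula (4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Manhattan distance, formula (5)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def minkowski(x, y, q):
    """Minkowski distance of order q, formula (6)."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def knn_predict(train_x, train_y, query, k=3, distance=manhattan):
    """Label a query point by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D points standing in for feature vectors (hypothetical data).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7
print(knn_predict(train_x, train_y, (0.5, 0.5)))  # a
```

Setting q = 1 in `minkowski` reproduces the Manhattan distance and q = 2 the Euclidean distance, so formula (6) generalizes the other two.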
Fig. 7: Comparative results for Oxford-102 Flowers Dataset
Fig. 8: Accuracy rates obtained by different classifiers for the Oxford-17 and Oxford-102 Flowers datasets

the consistency of our results in detail. So we are to expand our findings.

ACKNOWLEDGEMENT

This study derives from the author's master's thesis. It is supported by the Technology and Project Support Unit of Istanbul Kultur University.