"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
Bachelor of Technology
in
Computer Science and Engineering
Submitted by:
Deepanshu
Enroll No. A51405217002
DECLARATION
Signature(s)
CERTIFICATE
This is to certify that Deepanshu (Enroll No.: A51405217002), student of B.Tech (CSE)
Vth semester, ASET, Amity University Haryana, has done his Integrated Project entitled
“CNN For Flower Classification” under my guidance and supervision during June 2019 -
July 2019.
The work was satisfactory. He has shown complete dedication and devotion to the given
project work.
Signature of Supervisor(s)
Date:
(Mr. Anuj Kumar Singh)
Head
Amity School of Engineering & Technology
Amity University Haryana
ACKNOWLEDGMENT
I would like to express my deepest appreciation to all those who made it possible for me
to complete this project. I owe special gratitude to Mr. Anuj Kumar Singh, whose
stimulating suggestions and encouragement helped me coordinate my project, especially
in writing this report. His inspiring suggestions and timely guidance enabled me to
perceive the various aspects of the project in a new light.
I would also like to thank my parents and training mates for guiding and encouraging me
throughout the duration of the project.
Contents
Declaration
Certificate
Acknowledgment
Abstract
List of Figures
1. INTRODUCTION
1.1 Objective
2. BACKGROUND STUDY
3. TECHNOLOGIES USED
3.1 Keras
3.2 Jupyter Notebook
3.3 Scikit-Learn
3.4 Numpy
4. DESIGN OF PROJECT
4.1 Convolutional Layer
4.2 Strides
4.3 Padding
4.4 Non-Linearity (ReLU)
4.5 Pooling Layer
4.6 Fully Connected Layer
5. IMPLEMENTATION
6. SCREENSHOTS
7. SCOPE OF PROJECT
8. CONCLUSION
9. REFERENCES
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Artificial Intelligence has been witnessing monumental growth in bridging the gap between
the capabilities of humans and machines. Researchers and enthusiasts alike work on
numerous aspects of the field to make amazing things happen. One of many such areas is the
domain of Computer Vision. The agenda for this field is to enable machines to view the world
as humans do, perceive it in a similar manner, and even use that knowledge for a multitude of
tasks such as image and video recognition, image analysis and classification, media recreation,
recommendation systems, natural language processing, etc. The advancements in Computer
Vision with Deep Learning have been constructed and perfected over time, primarily around one
particular algorithm: the Convolutional Neural Network.
A Convolutional Neural Network (ConvNet or CNN) is one of the main categories of neural
networks used for image recognition and image classification. Object detection, face
recognition, etc., are some of the other areas where CNNs are widely used. CNN image
classification takes an input image, processes it, and classifies it under certain categories (e.g.,
dog, cat, tiger, lion). A computer sees an input image as an array of pixels, whose size depends
on the image resolution. Based on the image resolution, it will see h x w x d (h = height, w =
width, d = depth), e.g., a 6 x 6 x 3 array for an RGB image (3 refers to the RGB channels)
and a 4 x 4 x 1 array for a grayscale image.
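As a small illustration, the following sketch shows how an image is loaded as a pixel array in Python; the file name 'flower.jpg' is an assumed example path, not from this project.

import numpy as np
from PIL import Image

rgb = np.array(Image.open('flower.jpg'))                 # shape (h, w, 3) for RGB
gray = np.array(Image.open('flower.jpg').convert('L'))   # shape (h, w) for grayscale
print(rgb.shape, gray.shape)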
1.1 Objective
The main aim of this project is to use state-of-the-art deep learning methods to build a
Convolutional Neural Network architecture that can recognize flowers with good accuracy.
Not only can the model recognize a flower from a given input image, it can also tell what
kind/category the flower is; this is also known as multi-class image classification.
The model developed here uses multiple convolutional and fully connected layers along
with an output softmax layer for the forward propagation part, and for the backward
propagation part the model uses the Adam optimization algorithm.
CHAPTER 2
BACKGROUND STUDY
Classical ML algorithms often require complex feature engineering. Usually, a deep
exploratory data analysis is first performed on the dataset. Dimensionality reduction might
then be done for easier processing. Finally, the best features must be carefully selected to pass
to the ML algorithm. All of these steps of image classification with classical machine
learning take a lot of time and human resources. The complexity involved in image
classification using classical methods does not scale to large datasets, hence the need for
algorithms that learn parameters on their own. Classical ML techniques also cannot be easily
adapted to different domains and applications.
There are many different types of flowers, and new species of flowers continue to be
discovered. At present, people's lives tend to be far removed from nature, for example,
when living in a city. Sometimes we walk in a nearby park and encounter a flower that we have
never noticed before, and we may wonder how we could learn more about that
specific flower. Given an image of a flower, ordinary people with limited botanical
knowledge would not be able to tell which species the flower belongs to. With only the
image, there is no way to obtain further details about the flower without consulting a
botanist. To search for the information on the internet, at least one keyword related to that
flower must be known. Although there are methods for searching by input image
(e.g., Google Image Search), the derived results are often irrelevant to what we want.
There are almost 250,000 named species of flowering plants in the world. Many blooming
flowers can be observed in gardens, parks, roadsides, and many other locations, yet identifying
plants by their flowers can be done only by experienced taxonomists or botanists. Most
people don't have knowledge about these flowers, and in order to learn about them, people
usually have to use flower guide books or browse relevant websites on the internet using
keywords. Such keyword searching is not practical for many people. Moreover, the problem
of identifying an object against a background is known to be difficult. This difficulty arises
for many reasons: there can be interference between the object's features and the
background, and the background (the rest of the image) can be large compared to the object
that is meant to be recognized. The matching process also faces problems such as object
orientation, size, lighting, and many other factors. Flowers are a type of plant with many
categories; many of those categories or species have very similar features and looks, while
dissimilarity can be found within the same flower species. This similarity and dissimilarity
make highly accurate flower recognition a very hard challenge. With respect to the
above-mentioned points, recognizing flowers from their images in the usual ways, such as
search engines and search keywords on the internet or flower guide books, is inefficient,
time-consuming, and unlikely to bring the right result.
The work proposed in this project uses a Convolutional Neural Network architecture for
flower recognition that helps recognize a flower from an image in order to obtain further
information about its species.
CHAPTER 3
TECHNOLOGIES USED
3.1 Keras
Keras is an open-source neural network library written in Python. It is capable of running on
top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable
fast experimentation with deep neural networks, it focuses on being user-friendly, modular,
and extensible. It was developed as part of the research effort of project ONEIROS (Open-
ended Neuro-Electronic Intelligent Robot Operating System),[3] and its primary author and
maintainer is François Chollet, a Google engineer. Chollet is also the author of the Xception
deep neural network model.
As a high-level neural networks API, Keras enables fast experimentation through a user-
friendly, modular, and extensible interface, and it can run on both CPU and GPU. Keras is
part of the TensorFlow core, which makes it TensorFlow's preferred high-level API.
In 2017, Google's TensorFlow team decided to support Keras in TensorFlow's core
library. Chollet explained that Keras was conceived to be an interface rather than a
standalone machine learning framework. It offers a higher-level, more intuitive set of
abstractions that make it easy to develop deep learning models regardless of the
computational backend used. Microsoft added a CNTK backend to Keras as well, available as
of CNTK v2.0.
Keras contains numerous implementations of commonly used neural-network building blocks
such as layers, objectives, activation functions, optimizers, and a host of tools to make
working with image and text data easier. The code is hosted on GitHub, and community
support forums include the GitHub issues page, and a Slack channel.
In addition to standard neural networks, Keras has support for convolutional and recurrent
neural networks. It supports other common utility layers like dropout, batch normalization,
and pooling.
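As a brief illustration of these building blocks, below is a minimal sketch of a Keras model; the layer sizes are illustrative and assume a standard Keras installation, not the architecture used in this project.

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(16,)))  # layer with activation
model.add(Dropout(0.5))                                     # utility layer
model.add(Dense(2, activation='softmax'))                   # output layer
model.compile(optimizer='adam',                             # optimizer
              loss='categorical_crossentropy',              # objective
              metrics=['accuracy'])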
Keras allows users to productize deep models on smartphones (iOS and Android), on the
web, and on the Java Virtual Machine. It also allows distributed training of deep-learning
models on clusters of graphics processing units (GPUs) and tensor processing units
(TPUs).
3.2 Jupyter Notebook
Jupyter Notebook (formerly IPython Notebook) is a web-based interactive computational
environment for creating Jupyter notebook documents. The term "notebook" can colloquially
refer to many different entities, mainly the Jupyter web application, the Jupyter Python
web server, or the Jupyter document format, depending on context. A Jupyter Notebook
document is a JSON document, following a versioned schema, containing an ordered list
of input/output cells which can contain code, text (using Markdown), mathematics, plots, and
rich media, usually ending with the ".ipynb" extension.
A Jupyter Notebook can be converted to a number of open standard output formats
(HTML, presentation slides, LaTeX, PDF, ReStructuredText, Markdown, Python) through
"Download As" in the web interface, via the nbconvert library or "jupyter nbconvert"
command line interface in a shell.
To simplify visualisation of Jupyter notebook documents on the web, the nbconvert library is
provided as a service through NbViewer which can take a URL to any publicly available
notebook document, convert it to HTML on the fly and display it to the user.
Jupyter Notebook is built on several open-source components, including:
IPython
ØMQ
Tornado (web server)
jQuery
Bootstrap (front-end framework)
MathJax
Jupyter Notebook can connect to many kernels to allow programming in many languages. By
default, Jupyter Notebook ships with the IPython kernel. As of the 2.3 release (October 2014),
there were 49 Jupyter-compatible kernels for many programming languages, including
Python, R, Julia, and Haskell.
The Notebook interface was added to IPython in the 0.12 release (December 2011) and
renamed to Jupyter Notebook in 2015 (IPython 4.0 – Jupyter 1.0). Jupyter Notebook is
similar to the notebook interface of other programs such as Maple, Mathematica, and
SageMath, a computational interface style that originated with Mathematica in the 1980s.
3.3 Scikit – Learn
The scikit-learn project started as scikits.learn, a Google Summer of Code project by David
Cournapeau. Its name stems from the notion that it is a "SciKit" (SciPy Toolkit), a separately-
developed and distributed third-party extension to SciPy. The original codebase was later
rewritten by other developers. In 2010, Fabian Pedregosa, Gaël Varoquaux, Alexandre
Gramfort, and Vincent Michel, all from the French Institute for Research in Computer
Science and Automation in Rocquencourt, France, took leadership of the project and made
the first public release on 1 February 2010. Of the various scikits, scikit-learn as well as
scikit-image were described as "well-maintained and popular" in November 2012.
As of 2019, scikit-learn is under active development.
3.4 Numpy
NumPy is a Python library that adds support for large, multi-dimensional arrays and
matrices, along with a large collection of high-level mathematical functions to operate on
these arrays. In this project, NumPy is used to represent images as pixel arrays and to
prepare the training and test data.
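A small sketch of typical NumPy usage of this kind; the array sizes are illustrative, not the project's actual dataset.

import numpy as np

# e.g. 10 RGB images of size 64 x 64, with pixel values 0-255
images = np.random.randint(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)

x = images.astype('float32') / 255.0   # scale pixel values to [0, 1]
print(x.shape, x.min(), x.max())       # (10, 64, 64, 3) 0.0 1.0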
CHAPTER 4
DESIGN OF PROJECT
The architecture used in this project consists of four convolutional layers, two fully
connected layers, and an output softmax layer. The summary of the architecture is
shown below:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_5 (Conv2D) (None, 64, 64, 32) 2432
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 32, 32, 32) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 32, 32, 64) 18496
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 16, 16, 64) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 16, 16, 64) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 16, 16, 96) 55392
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 8, 8, 96) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 8, 8, 96) 83040
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 4, 4, 96) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 4, 4, 96) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 1536) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 786944
_________________________________________________________________
dense_2 (Dense) (None, 128) 65664
_________________________________________________________________
dense_3 (Dense) (None, 5) 645
=================================================================
Total params: 1,012,613
Trainable params: 1,012,613
Non-trainable params: 0
4.1 Convolutional Layer
Convolution is the first layer used to extract features from an input image. Convolution
preserves the relationship between pixels by learning image features using small squares of
input data. It is a mathematical operation that takes two inputs: an image matrix and a filter
(or kernel).
Consider a 5 x 5 image matrix whose pixel values are 0 or 1 and a 3 x 3 filter matrix, as
shown below. The convolution of the 5 x 5 image matrix with the 3 x 3 filter matrix
produces what is called a "feature map". Convolving an image with different filters can
perform operations such as edge detection, blurring, and sharpening.
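The following is a minimal sketch of this operation (no padding, stride 1) in NumPy; the pixel and filter values are illustrative, not taken from the report's figure.

import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

h, w = image.shape
f = kernel.shape[0]
feature_map = np.zeros((h - f + 1, w - f + 1))
for i in range(h - f + 1):
    for j in range(w - f + 1):
        # element-wise multiply the window with the kernel and sum
        feature_map[i, j] = np.sum(image[i:i+f, j:j+f] * kernel)

print(feature_map)  # the 3 x 3 feature map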
4.2 Strides
Stride is the number of pixels by which the filter shifts over the input matrix. When the
stride is 1, we move the filter 1 pixel at a time; when the stride is 2, we move the filter 2
pixels at a time, and so on. The figure below shows how convolution works with a stride
of 2.
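For reference, for an n x n input, an f x f filter, padding p, and stride s, the output size follows the standard formula floor((n + 2p - f) / s) + 1. For example, a 7 x 7 input convolved with a 3 x 3 filter, no padding, and stride 2 produces floor((7 - 3) / 2) + 1 = 3, i.e., a 3 x 3 feature map.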
4.3 Padding
Sometimes the filter does not perfectly fit the input image. We then have two options:
Drop the part of the image where the filter does not fit. This is called valid padding, which
keeps only the valid part of the image.
Pad the image with zeros (zero-padding) so that the filter fits. With "same" padding, the
output keeps the same spatial size as the input; this is the option used in the model in this
project (padding='same').
4.4 Non-Linearity (ReLU)
ReLU stands for Rectified Linear Unit, a non-linear operation. Its output is ƒ(x) =
max(0, x).
Why is ReLU important? ReLU's purpose is to introduce non-linearity into our ConvNet,
since the real-world data we want our ConvNet to learn is mostly non-linear. Other non-
linear functions such as tanh or sigmoid can also be used instead of ReLU, but most
practitioners use ReLU since, performance-wise, it is better than the other two.
4.5 Pooling Layer
Pooling layers reduce the number of parameters when the images are too large. Spatial
pooling (also called subsampling or downsampling) reduces the dimensionality of each
feature map but retains the important information. Spatial pooling can be of different types:
Max Pooling
Average Pooling
Sum Pooling
Max pooling takes the largest element from the rectified feature map; average pooling takes
the average of the elements; and sum pooling takes the sum of all elements in the feature
map.
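A minimal sketch of ReLU followed by 2 x 2 max pooling with stride 2, using NumPy; the feature-map values are illustrative.

import numpy as np

feature_map = np.array([[ 1, -3,  2,  4],
                        [-1,  6, -2,  1],
                        [ 3,  2,  1, -5],
                        [ 0,  1,  4,  2]])

rectified = np.maximum(0, feature_map)  # ReLU: negative values become 0

# group into non-overlapping 2 x 2 blocks and take the max of each block
pooled = rectified.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6. 4.], [3. 4.]]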
4.6 Fully Connected Layer
In the layer we call the FC (fully connected) layer, we flatten our matrix into a vector and
feed it into a fully connected layer like a regular neural network.
The feature map matrix is converted into a vector (x1, x2, x3, ...). With the fully connected
layers, we combine these features together to create a model. Finally, we apply an
activation function such as softmax or sigmoid to classify the outputs as, e.g., cat, dog, car,
or truck.
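A minimal sketch of the softmax function used at the output layer; the score values are illustrative.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1, 0.5, 1.5])  # raw scores for 5 classes
print(softmax(scores))  # class probabilities that sum to 1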
CHAPTER 5
IMPLEMENTATION
5.1 Model Definition
# Imports assumed by the model definition below.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

def CNN():
    # img_size is assumed to be defined earlier; per the model summary, img_size = 64
    cnn1 = Sequential()
    cnn1.add(Conv2D(32, kernel_size=(5, 5), padding='same',
                    activation='relu', input_shape=(img_size, img_size, 3)))
    cnn1.add(MaxPooling2D(pool_size=(2, 2)))
    cnn1.add(Conv2D(64, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Dropout(0.35))
    cnn1.add(Conv2D(96, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Conv2D(96, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Dropout(0.35))
    cnn1.add(Flatten())
    cnn1.add(Dense(512, activation='relu'))
    cnn1.add(Dense(128, activation='relu'))
    cnn1.add(Dense(5, activation='softmax'))  # one output per flower class
    cnn1.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy',
                 metrics=['accuracy'])
    return cnn1
5.2 Model Training
model = CNN()
history = model.fit(x=x_train, y=y_train, epochs=20, batch_size=64,
                    validation_data=(x_test, y_test))
Train on 3458 samples, validate on 865 samples
Epoch 1/20
3458/3458 [==============================] - 48s 14ms/step - loss: 1.4286 - acc: 0.3412 - val_loss: 1.2392 - val_acc: 0.4913
Epoch 2/20
3458/3458 [==============================] - 47s 14ms/step - loss: 1.1605 - acc: 0.4954 - val_loss: 1.2244 - val_acc: 0.5098
Epoch 3/20
3458/3458 [==============================] - 45s 13ms/step - loss: 1.0949 - acc: 0.5495 - val_loss: 1.0437 - val_acc: 0.5676
Epoch 4/20
3458/3458 [==============================] - 46s 13ms/step - loss: 1.0279 - acc: 0.5862 - val_loss: 0.9914 - val_acc: 0.5873
Epoch 5/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.9401 - acc: 0.6243 - val_loss: 1.0301 - val_acc: 0.5746
Epoch 6/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.9287 - acc: 0.6180 - val_loss: 0.9152 - val_acc: 0.6335
Epoch 7/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.8764 - acc: 0.6466 - val_loss: 0.8481 - val_acc: 0.6636
Epoch 8/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.8085 - acc: 0.6816 - val_loss: 0.9430 - val_acc: 0.6555
Epoch 9/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.8056 - acc: 0.6909 - val_loss: 0.8355 - val_acc: 0.6960
Epoch 10/20
3458/3458 [==============================] - 55s 16ms/step - loss: 0.7640 - acc: 0.7131 - val_loss: 0.8734 - val_acc: 0.6855
Epoch 11/20
3458/3458 [==============================] - 49s 14ms/step - loss: 0.8100 - acc: 0.6874 - val_loss: 0.8752 - val_acc: 0.6671
Epoch 12/20
3458/3458 [==============================] - 58s 17ms/step - loss: 0.7777 - acc: 0.6958 - val_loss: 0.7689 - val_acc: 0.7064
Epoch 13/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.6569 - acc: 0.7525 - val_loss: 0.7513 - val_acc: 0.7121
Epoch 14/20
3458/3458 [==============================] - 54s 16ms/step - loss: 0.6425 - acc: 0.7553 - val_loss: 0.8632 - val_acc: 0.6740
Epoch 15/20
3458/3458 [==============================] - 56s 16ms/step - loss: 0.6158 - acc: 0.7675 - val_loss: 0.7483 - val_acc: 0.7179
Epoch 16/20
3458/3458 [==============================] - 53s 15ms/step - loss: 0.5601 - acc: 0.7860 - val_loss: 0.7845 - val_acc: 0.7179
Epoch 17/20
3458/3458 [==============================] - 57s 16ms/step - loss: 0.6109 - acc: 0.7568 - val_loss: 0.7666 - val_acc: 0.7133
Epoch 18/20
3458/3458 [==============================] - 50s 14ms/step - loss: 0.4981 - acc: 0.8086 - val_loss: 0.7333 - val_acc: 0.7422
Epoch 19/20
3458/3458 [==============================] - 49s 14ms/step - loss: 0.4530 - acc: 0.8259 - val_loss: 0.8835 - val_acc: 0.6971
Epoch 20/20
3458/3458 [==============================] - 46s 13ms/step - loss: 0.5962 - acc: 0.7762 - val_loss: 0.7185 - val_acc: 0.7364
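After training, the model can be used to classify a new image. A minimal sketch follows; the class order is an assumption and depends on how the labels were one-hot encoded.

import numpy as np

# assumed label order used when encoding the five classes
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']

probs = model.predict(x_test[:1])      # softmax output, shape (1, 5)
print(classes[int(np.argmax(probs))])  # predicted flower class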
CHAPTER 6
SCREENSHOTS
CHAPTER 7
SCOPE OF PROJECT
The current CNN model is trained on a dataset of approximately four thousand flower
images covering five classes of flowers, namely daisy, dandelion, tulip, sunflower, and rose.
This can be extended to cover many more flower species, and, most importantly, the model
can further be connected to a database that gives the user important information about the
searched flower.
CHAPTER 8
CONCLUSION
CNN is a powerful artificial intelligence tool for pattern classification. In this project, a CNN
architecture for classifying flower image classes is proposed. The architecture is designed
with four convolutional layers, each with a different filter window size, which improves the
speed and accuracy of recognition. A max-pooling technique is applied to obtain higher
accuracy. The final validation accuracy of the proposed CNN model is 73.64%.