"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
"CNN For Flower Classification": Bachelor of Technology Computer Science and Engineering
Bachelor of Technology
in
Computer Science and Engineering
Submitted by:
Deepanshu
Enroll No. A51405217002
DECLARATION
Signature(s)
CERTIFICATE
This is to certify that Deepanshu (Enroll No.: A51405217002), student of B.Tech (CSE)
Vth semester, ASET, Amity University Haryana, has done his Integrated Project entitled
“CNN For Flower Classification” under my guidance and supervision during June 2019 -
July 2019.
The work was satisfactory. He has shown complete dedication and devotion to the given
project work.
Signature of Supervisor(s)
Date:
(Mr. Anuj Kumar Singh)
Head
Amity School of Engineering & Technology
Amity University Haryana
ACKNOWLEDGMENT
I would like to express my deepest appreciation to all those who made it possible for me
to complete this project. I owe special gratitude to Mr. Anuj Kumar Singh, whose
stimulating suggestions and encouragement helped me coordinate my project, especially
in writing this report. His inspiring suggestions and timely guidance enabled me to
perceive the various aspects of the project in a new light.
I would also like to thank my parents and training mates for guiding and encouraging me
throughout the duration of the project.
Contents
Declaration
Certificate
Acknowledgment
Abstract
List of Figures
1. INTRODUCTION
1.1 Objective
2. BACKGROUND STUDY
3. TECHNOLOGIES USED
3.1 Keras
3.2 Jupyter Notebook
3.3 Scikit-Learn
3.4 Numpy
4. DESIGN OF PROJECT
4.1 Convolutional Layer
4.2 Strides
4.3 Padding
4.4 Non-Linearity (ReLU)
4.5 Pooling Layer
4.6 Fully Connected Layer
5. IMPLEMENTATION
6. SCREENSHOTS
7. SCOPE OF PROJECT
8. CONCLUSION
9. REFERENCES
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Artificial Intelligence has been witnessing monumental growth in bridging the gap between
the capabilities of humans and machines. Researchers and enthusiasts alike work on
numerous aspects of the field to make amazing things happen. One of many such areas is the
domain of Computer Vision. The agenda for this field is to enable machines to view the world
as humans do, perceive it in a similar manner, and even use that knowledge for a multitude of
tasks such as image and video recognition, image analysis and classification, media recreation,
recommendation systems, natural language processing, etc. The advancements in Computer
Vision with Deep Learning have been constructed and perfected over time, primarily around one
particular algorithm: the Convolutional Neural Network.
A Convolutional Neural Network (ConvNet or CNN) is one of the main categories of neural
networks used for image recognition and image classification. Object detection, face
recognition, etc., are some of the other areas where CNNs are widely used. CNN image
classification takes an input image, processes it, and classifies it under certain categories (e.g.,
dog, cat, tiger, lion). A computer sees an input image as an array of pixels, whose size depends
on the image resolution. Based on the image resolution, it will see h x w x d (h = height, w =
width, d = depth), e.g., a 6 x 6 x 3 array for an RGB image (3 refers to the RGB channels)
and a 4 x 4 x 1 array for a grayscale image.
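As a small illustration, the following sketch shows how an image is loaded as a pixel array in Python; the file name 'flower.jpg' is an assumed example path, not from this project.

import numpy as np
from PIL import Image

rgb = np.array(Image.open('flower.jpg'))                 # shape (h, w, 3) for RGB
gray = np.array(Image.open('flower.jpg').convert('L'))   # shape (h, w) for grayscale
print(rgb.shape, gray.shape)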
1.1 Objective
The main aim of this project is to use state-of-the-art deep learning methods to build a
Convolutional Neural Network architecture that can recognize flowers with good accuracy.
Not only can the model recognize a flower from a given input image, it can also tell what
kind/category the flower is; this is also known as multi-class image classification.
The model developed here uses multiple convolutional and fully connected layers along
with an output softmax layer for the forward propagation part, and for the backward
propagation part the model uses the Adam optimization algorithm.
CHAPTER 2
BACKGROUND STUDY
Classical ML algorithms often require complex feature engineering. Usually, a deep
exploratory data analysis is first performed on the dataset. Dimensionality reduction might
then be done for easier processing. Finally, the best features must be carefully selected to pass
to the ML algorithm. All of these steps of image classification with classical machine
learning take a lot of time and human resources. The complexity involved in image
classification using classical methods does not scale to large datasets, hence the need for
algorithms that learn parameters on their own. Classical ML techniques also cannot be easily
adapted to different domains and applications.
There are many different types of flowers, and new species of flowers continue to be
discovered. At present, people's lives tend to be far removed from nature, for example,
when living in a city. Sometimes we walk in a nearby park and encounter a flower that we have
never noticed before, and we may wonder how we could learn more about that
specific flower. Given an image of a flower, ordinary people with limited botanical
knowledge would not be able to tell which species the flower belongs to. With only the
image, there is no way to obtain further details about the flower without consulting a
botanist. To search for the information on the internet, at least one keyword related to that
flower must be known. Although there are methods for searching by input image
(e.g., Google Image Search), the derived results are often irrelevant to what we want.
There are almost 250,000 named species of flowering plants in the world. Many blooming
flowers can be observed in gardens, parks, roadsides, and many other locations, yet identifying
plants by their flowers can be done only by experienced taxonomists or botanists. Most
people don't have knowledge about these flowers, and in order to learn about them, people
usually have to use flower guide books or browse relevant websites on the internet using
keywords. Such keyword searching is not practical for many people. Moreover, the problem
of identifying an object against a background is known to be difficult. This difficulty arises
for many reasons: there can be interference between the object's features and the
background, and the background (the rest of the image) can be large compared to the object
that is meant to be recognized. The matching process also faces problems such as object
orientation, size, lighting, and many other factors. Flowers are a type of plant with many
categories; many of those categories or species have very similar features and looks, while
dissimilarity can be found within the same flower species. This similarity and dissimilarity
make highly accurate flower recognition a very hard challenge. With respect to the
above-mentioned points, recognizing flowers from their images in the usual ways, such as
search engines and search keywords on the internet or flower guide books, is inefficient,
time-consuming, and unlikely to bring the right result.
The work proposed in this project uses a Convolutional Neural Network architecture for
flower recognition that helps recognize a flower from an image in order to obtain further
information about its species.
CHAPTER 3
TECHNOLOGIES USED
3.1 Keras
Keras is an open-source neural network library written in Python. It is capable of running on
top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable
fast experimentation with deep neural networks, it focuses on being user-friendly, modular,
and extensible. It was developed as part of the research effort of project ONEIROS (Open-
ended Neuro-Electronic Intelligent Robot Operating System),[3] and its primary author and
maintainer is François Chollet, a Google engineer. Chollet is also the author of the Xception
deep neural network model.
As a high-level neural networks API, Keras enables fast experimentation through a user-
friendly, modular, and extensible interface, and it can run on both CPU and GPU. Keras is
part of the TensorFlow core, which makes it TensorFlow's preferred high-level API.
In 2017, Google's TensorFlow team decided to support Keras in TensorFlow's core
library. Chollet explained that Keras was conceived to be an interface rather than a
standalone machine learning framework. It offers a higher-level, more intuitive set of
abstractions that make it easy to develop deep learning models regardless of the
computational backend used. Microsoft added a CNTK backend to Keras as well, available as
of CNTK v2.0.
Keras contains numerous implementations of commonly used neural-network building blocks
such as layers, objectives, activation functions, optimizers, and a host of tools to make
working with image and text data easier. The code is hosted on GitHub, and community
support forums include the GitHub issues page, and a Slack channel.
In addition to standard neural networks, Keras has support for convolutional and recurrent
neural networks. It supports other common utility layers like dropout, batch normalization,
and pooling.
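As a brief illustration of these building blocks, below is a minimal sketch of a Keras model; the layer sizes are illustrative and assume a standard Keras installation, not the architecture used in this project.

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(16,)))  # layer with activation
model.add(Dropout(0.5))                                     # utility layer
model.add(Dense(2, activation='softmax'))                   # output layer
model.compile(optimizer='adam',                             # optimizer
              loss='categorical_crossentropy',              # objective
              metrics=['accuracy'])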
Keras allows users to productize deep models on smartphones (iOS and Android), on the
web, and on the Java Virtual Machine. It also allows distributed training of deep-learning
models on clusters of graphics processing units (GPUs) and tensor processing units
(TPUs).
3.2 Jupyter Notebook
Jupyter Notebook (formerly IPython Notebook) is a web-based interactive computational
environment for creating Jupyter notebook documents. The term "notebook" can colloquially
refer to many different entities, mainly the Jupyter web application, the Jupyter Python
web server, or the Jupyter document format, depending on context. A Jupyter Notebook
document is a JSON document, following a versioned schema, containing an ordered list
of input/output cells which can contain code, text (using Markdown), mathematics, plots, and
rich media, usually ending with the ".ipynb" extension.
A Jupyter Notebook can be converted to a number of open standard output formats
(HTML, presentation slides, LaTeX, PDF, ReStructuredText, Markdown, Python) through
"Download As" in the web interface, via the nbconvert library or "jupyter nbconvert"
command line interface in a shell.
To simplify visualisation of Jupyter notebook documents on the web, the nbconvert library is
provided as a service through NbViewer which can take a URL to any publicly available
notebook document, convert it to HTML on the fly and display it to the user.
Jupyter Notebook is built on several open-source components, including:
IPython
ØMQ
Tornado (web server)
jQuery
Bootstrap (front-end framework)
MathJax
Jupyter Notebook can connect to many kernels to allow programming in many languages. By
default, Jupyter Notebook ships with the IPython kernel. As of the 2.3 release (October 2014),
there were 49 Jupyter-compatible kernels for many programming languages, including
Python, R, Julia, and Haskell.
The Notebook interface was added to IPython in the 0.12 release (December 2011) and
renamed to Jupyter Notebook in 2015 (IPython 4.0 – Jupyter 1.0). Jupyter Notebook is
similar to the notebook interface of other programs such as Maple, Mathematica, and
SageMath, a computational interface style that originated with Mathematica in the 1980s.
3.3 Scikit – Learn
The scikit-learn project started as scikits.learn, a Google Summer of Code project by David
Cournapeau. Its name stems from the notion that it is a "SciKit" (SciPy Toolkit), a separately-
developed and distributed third-party extension to SciPy. The original codebase was later
rewritten by other developers. In 2010, Fabian Pedregosa, Gaël Varoquaux, Alexandre
Gramfort, and Vincent Michel, all from the French Institute for Research in Computer
Science and Automation in Rocquencourt, France, took leadership of the project and made
the first public release on 1 February 2010. Of the various scikits, scikit-learn as well as
scikit-image were described as "well-maintained and popular" in November 2012.
As of 2019, scikit-learn is under active development.
3.4 Numpy
NumPy is a Python library that adds support for large, multi-dimensional arrays and
matrices, along with a large collection of high-level mathematical functions to operate on
these arrays. In this project, NumPy is used to represent images as pixel arrays and to
prepare the training and test data.
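A small sketch of typical NumPy usage of this kind; the array sizes are illustrative, not the project's actual dataset.

import numpy as np

# e.g. 10 RGB images of size 64 x 64, with pixel values 0-255
images = np.random.randint(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)

x = images.astype('float32') / 255.0   # scale pixel values to [0, 1]
print(x.shape, x.min(), x.max())       # (10, 64, 64, 3) 0.0 1.0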
CHAPTER 4
DESIGN OF PROJECT
The architecture used in this project consists of four convolutional layers, two fully
connected layers, and an output softmax layer. The summary of the architecture is
shown below:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_5 (Conv2D) (None, 64, 64, 32) 2432
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 32, 32, 32) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 32, 32, 64) 18496
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 16, 16, 64) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 16, 16, 64) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 16, 16, 96) 55392
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 8, 8, 96) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 8, 8, 96) 83040
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 4, 4, 96) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 4, 4, 96) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 1536) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 786944
_________________________________________________________________
dense_2 (Dense) (None, 128) 65664
_________________________________________________________________
dense_3 (Dense) (None, 5) 645
=================================================================
Total params: 1,012,613
Trainable params: 1,012,613
Non-trainable params: 0
4.1 Convolutional Layer
Convolution is the first layer used to extract features from an input image. Convolution
preserves the relationship between pixels by learning image features using small squares of
input data. It is a mathematical operation that takes two inputs: an image matrix and a filter
(or kernel).
Consider a 5 x 5 image matrix whose pixel values are 0 or 1 and a 3 x 3 filter matrix, as
shown below. The convolution of the 5 x 5 image matrix with the 3 x 3 filter matrix
produces what is called a "feature map". Convolving an image with different filters can
perform operations such as edge detection, blurring, and sharpening.
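The following is a minimal sketch of this operation (no padding, stride 1) in NumPy; the pixel and filter values are illustrative, not taken from the report's figure.

import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

h, w = image.shape
f = kernel.shape[0]
feature_map = np.zeros((h - f + 1, w - f + 1))
for i in range(h - f + 1):
    for j in range(w - f + 1):
        # element-wise multiply the window with the kernel and sum
        feature_map[i, j] = np.sum(image[i:i+f, j:j+f] * kernel)

print(feature_map)  # the 3 x 3 feature map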
4.2 Strides
Stride is the number of pixels by which the filter shifts over the input matrix. When the
stride is 1, we move the filter 1 pixel at a time; when the stride is 2, we move the filter 2
pixels at a time, and so on. The figure below shows how convolution works with a stride
of 2.
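For reference, for an n x n input, an f x f filter, padding p, and stride s, the output size follows the standard formula floor((n + 2p - f) / s) + 1. For example, a 7 x 7 input convolved with a 3 x 3 filter, no padding, and stride 2 produces floor((7 - 3) / 2) + 1 = 3, i.e., a 3 x 3 feature map.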
4.3 Padding
Sometimes the filter does not perfectly fit the input image. We then have two options:
Drop the part of the image where the filter does not fit. This is called valid padding, which
keeps only the valid part of the image.
Pad the image with zeros (zero-padding) so that the filter fits. With "same" padding, the
output keeps the same spatial size as the input; this is the option used in the model in this
project (padding='same').
4.4 Non-Linearity (ReLU)
ReLU stands for Rectified Linear Unit, a non-linear operation. Its output is ƒ(x) =
max(0, x).
Why is ReLU important? ReLU's purpose is to introduce non-linearity into our ConvNet,
since the real-world data we want our ConvNet to learn is mostly non-linear. Other non-
linear functions such as tanh or sigmoid can also be used instead of ReLU, but most
practitioners use ReLU since, performance-wise, it is better than the other two.
4.5 Pooling Layer
Pooling layers reduce the number of parameters when the images are too large. Spatial
pooling (also called subsampling or downsampling) reduces the dimensionality of each
feature map but retains the important information. Spatial pooling can be of different types:
Max Pooling
Average Pooling
Sum Pooling
Max pooling takes the largest element from the rectified feature map; average pooling takes
the average of the elements; and sum pooling takes the sum of all elements in the feature
map.
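A minimal sketch of ReLU followed by 2 x 2 max pooling with stride 2, using NumPy; the feature-map values are illustrative.

import numpy as np

feature_map = np.array([[ 1, -3,  2,  4],
                        [-1,  6, -2,  1],
                        [ 3,  2,  1, -5],
                        [ 0,  1,  4,  2]])

rectified = np.maximum(0, feature_map)  # ReLU: negative values become 0

# group into non-overlapping 2 x 2 blocks and take the max of each block
pooled = rectified.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6. 4.], [3. 4.]]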
4.6 Fully Connected Layer
In the layer we call the FC (fully connected) layer, we flatten our matrix into a vector and
feed it into a fully connected layer like a regular neural network.
The feature map matrix is converted into a vector (x1, x2, x3, ...). With the fully connected
layers, we combine these features together to create a model. Finally, we apply an
activation function such as softmax or sigmoid to classify the outputs as, e.g., cat, dog, car,
or truck.
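A minimal sketch of the softmax function used at the output layer; the score values are illustrative.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1, 0.5, 1.5])  # raw scores for 5 classes
print(softmax(scores))  # class probabilities that sum to 1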
CHAPTER 5
IMPLEMENTATION
5.1 Model Definition
# Imports assumed by the model definition below.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

def CNN():
    # img_size is assumed to be defined earlier; per the model summary, img_size = 64
    cnn1 = Sequential()
    cnn1.add(Conv2D(32, kernel_size=(5, 5), padding='same',
                    activation='relu', input_shape=(img_size, img_size, 3)))
    cnn1.add(MaxPooling2D(pool_size=(2, 2)))
    cnn1.add(Conv2D(64, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Dropout(0.35))
    cnn1.add(Conv2D(96, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Conv2D(96, kernel_size=(3, 3), padding='same', activation='relu'))
    cnn1.add(MaxPooling2D(strides=(2, 2), pool_size=(2, 2)))
    cnn1.add(Dropout(0.35))
    cnn1.add(Flatten())
    cnn1.add(Dense(512, activation='relu'))
    cnn1.add(Dense(128, activation='relu'))
    cnn1.add(Dense(5, activation='softmax'))  # one output per flower class
    cnn1.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy',
                 metrics=['accuracy'])
    return cnn1
5.2 Model Training
model = CNN()
history = model.fit(x=x_train, y=y_train, epochs=20, batch_size=64,
                    validation_data=(x_test, y_test))
Train on 3458 samples, validate on 865 samples
Epoch 1/20
3458/3458 [==============================] - 48s 14ms/step - loss: 1.4286 - acc: 0.3412 - val_loss: 1.2392 - val_acc: 0.4913
Epoch 2/20
3458/3458 [==============================] - 47s 14ms/step - loss: 1.1605 - acc: 0.4954 - val_loss: 1.2244 - val_acc: 0.5098
Epoch 3/20
3458/3458 [==============================] - 45s 13ms/step - loss: 1.0949 - acc: 0.5495 - val_loss: 1.0437 - val_acc: 0.5676
Epoch 4/20
3458/3458 [==============================] - 46s 13ms/step - loss: 1.0279 - acc: 0.5862 - val_loss: 0.9914 - val_acc: 0.5873
Epoch 5/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.9401 - acc: 0.6243 - val_loss: 1.0301 - val_acc: 0.5746
Epoch 6/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.9287 - acc: 0.6180 - val_loss: 0.9152 - val_acc: 0.6335
Epoch 7/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.8764 - acc: 0.6466 - val_loss: 0.8481 - val_acc: 0.6636
Epoch 8/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.8085 - acc: 0.6816 - val_loss: 0.9430 - val_acc: 0.6555
Epoch 9/20
3458/3458 [==============================] - 48s 14ms/step - loss: 0.8056 - acc: 0.6909 - val_loss: 0.8355 - val_acc: 0.6960
Epoch 10/20
3458/3458 [==============================] - 55s 16ms/step - loss: 0.7640 - acc: 0.7131 - val_loss: 0.8734 - val_acc: 0.6855
Epoch 11/20
3458/3458 [==============================] - 49s 14ms/step - loss: 0.8100 - acc: 0.6874 - val_loss: 0.8752 - val_acc: 0.6671
Epoch 12/20
3458/3458 [==============================] - 58s 17ms/step - loss: 0.7777 - acc: 0.6958 - val_loss: 0.7689 - val_acc: 0.7064
Epoch 13/20
3458/3458 [==============================] - 51s 15ms/step - loss: 0.6569 - acc: 0.7525 - val_loss: 0.7513 - val_acc: 0.7121
Epoch 14/20
3458/3458 [==============================] - 54s 16ms/step - loss: 0.6425 - acc: 0.7553 - val_loss: 0.8632 - val_acc: 0.6740
Epoch 15/20
3458/3458 [==============================] - 56s 16ms/step - loss: 0.6158 - acc: 0.7675 - val_loss: 0.7483 - val_acc: 0.7179
Epoch 16/20
3458/3458 [==============================] - 53s 15ms/step - loss: 0.5601 - acc: 0.7860 - val_loss: 0.7845 - val_acc: 0.7179
Epoch 17/20
3458/3458 [==============================] - 57s 16ms/step - loss: 0.6109 - acc: 0.7568 - val_loss: 0.7666 - val_acc: 0.7133
Epoch 18/20
3458/3458 [==============================] - 50s 14ms/step - loss: 0.4981 - acc: 0.8086 - val_loss: 0.7333 - val_acc: 0.7422
Epoch 19/20
3458/3458 [==============================] - 49s 14ms/step - loss: 0.4530 - acc: 0.8259 - val_loss: 0.8835 - val_acc: 0.6971
Epoch 20/20
3458/3458 [==============================] - 46s 13ms/step - loss: 0.5962 - acc: 0.7762 - val_loss: 0.7185 - val_acc: 0.7364
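After training, the model can be used to classify a new image. A minimal sketch follows; the class order is an assumption and depends on how the labels were one-hot encoded.

import numpy as np

# assumed label order used when encoding the five classes
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']

probs = model.predict(x_test[:1])      # softmax output, shape (1, 5)
print(classes[int(np.argmax(probs))])  # predicted flower class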
CHAPTER 6
SCREENSHOTS
CHAPTER 7
SCOPE OF PROJECT
The current CNN model is trained on a dataset of approximately four thousand flower
images covering five classes of flowers, namely daisy, dandelion, tulip, sunflower, and rose.
This can be extended to cover many more flower species, and, most importantly, the model
can further be connected to a database that gives the user important information about the
searched flower.
CHAPTER 8
CONCLUSION
CNN is a powerful artificial intelligence tool for pattern classification. In this project, a CNN
architecture for classifying flower image classes is proposed. The architecture is designed
with four convolutional layers, each with a different filter window size, which improves the
speed and accuracy of recognition. A max-pooling technique is applied to obtain higher
accuracy. The final validation accuracy of the proposed CNN model is 73.64%.