Plant Disease Detection Using Convolutional Neural Network
Submitted By
Suman Chatterjee
Admission No.: 15JE001400
UNDER THE GUIDANCE OF
Dr. ACS Rao
Department of Computer Science and Engineering
IIT(ISM) Dhanbad - 826004
April, 2019
ACKNOWLEDGEMENT
I take this opportunity to express my deep sense of gratitude and respect towards my project
guide, Dr. ACS Rao, Assistant Professor, Department of Computer Science And Engineering,
IIT(ISM) DHANBAD.
I am very much indebted to him for the generosity, expertise and guidance I have received from
him while working on the project.
SUMAN CHATTERJEE
ADMISSION NO.: 15JE001400
Abstract
When plants and crops are affected by pests and diseases, the
agricultural production of the country suffers. Usually, farmers or
experts observe plants with the naked eye to detect and identify
disease, but this method can be time-consuming, expensive and
inaccurate. Automatic detection using image processing techniques
provides fast and accurate results. This report presents a new approach
to developing a plant disease recognition model, based on leaf image
classification using deep convolutional neural networks. Advances in
computer vision present an opportunity to expand and enhance the
practice of precise plant protection and to extend the market for
computer vision applications in the field of precision agriculture. The
novel way of training and the methodology used facilitate quick and
easy implementation of the system in practice. All the essential steps
required to implement this disease recognition model are fully described
throughout the report, from gathering images to create a database
assessed by agricultural experts, to training the deep CNN within a
deep learning framework. The approach detects plant diseases using a
deep convolutional neural network trained and fine-tuned to fit a
database of leaf images gathered independently for diverse plant
diseases. The advance and novelty of the developed model lie in its
simplicity: healthy leaves and background images are treated as classes
alongside the disease classes, enabling the model to distinguish
diseased leaves from healthy ones and from the environment.
INTRODUCTION
The Dataset:-
The dataset was taken from the PlantVillage dataset available on
Kaggle. The code was also written in an online Kaggle kernel, for better
computation and easier analysis of the training and validation loss.
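As a minimal sketch of how such a dataset is typically indexed (the function name is mine, not from the report): the PlantVillage dataset on Kaggle ships one folder per disease class, and Keras-style loaders assign integer labels by sorting the class-folder names alphabetically. The same mapping can be built directly:

```python
# Hypothetical helper: map each class folder under the dataset root
# (e.g. "Tomato___Early_blight") to an integer label, mirroring the
# alphabetical ordering that Keras-style directory loaders use.
from pathlib import Path

def class_index_from_folders(root: str) -> dict:
    """Map each class-folder name under `root` to an integer label."""
    names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: i for i, name in enumerate(names)}
```

The resulting dictionary can then be used to turn folder names into the class indices fed to the network during training.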
Image Preprocessing and Labelling:
Preprocessing images commonly involves removing low-frequency
background noise, normalizing the intensity of individual particle
images, removing reflections, and masking portions of images; it is the
technique of enhancing the data before training. Furthermore, the
preprocessing procedure involved cropping all the images manually,
making a square around the leaves, in order to highlight the region of
interest (the plant leaves). During the phase of collecting images for the
dataset, images with a resolution smaller than 500 pixels in either
dimension were not considered valid for the dataset. In addition, only
images where the region of interest was in higher resolution were
marked as eligible candidates for the dataset. In that way, it was
ensured that the images contain all the information needed for feature
learning. Many resources can be found by searching the Internet, but
their relevance is often unreliable. To confirm the accuracy of the
classes in the dataset, which were initially grouped by a keyword
search, agricultural experts examined the leaf images and labelled each
image with the appropriate disease acronym. As is well known, it is
important to use accurately classified images for the training and
validation datasets; only then can an appropriate and reliable detection
model be developed. In this stage, duplicate images left over from the
initial iteration of gathering and grouping images into classes were
removed from the dataset.
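The two filtering rules above can be sketched as small pure functions (the function names are mine, used only for illustration): an eligibility check that rejects images whose smaller dimension is below 500 pixels, and a centered square crop box in the (left, upper, right, lower) convention used by PIL's `Image.crop`:

```python
# Illustrative sketch of the preprocessing rules described above;
# these helpers are my own naming, not code from the report.

MIN_DIMENSION = 500  # pixels, per the dataset eligibility rule above

def is_eligible(width: int, height: int) -> bool:
    """An image qualifies only if both dimensions reach 500 px."""
    return min(width, height) >= MIN_DIMENSION

def centered_square_box(width: int, height: int):
    """Largest centered square, as a (left, upper, right, lower)
    tuple in the convention used by PIL's Image.crop."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)
```

In practice the report's crops were drawn manually around each leaf; a centered square is only a reasonable automatic default when the leaf fills most of the frame.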
Neural Network Training:-
We propose training a deep convolutional neural network to build an
image classification model from the dataset. TensorFlow is an open-
source software library for numerical computation using data-flow
graphs. Nodes in the graph represent mathematical operations, while
the graph edges represent the multidimensional data arrays (tensors)
communicated between them. The flexible architecture allows you to
deploy computation to one or more CPUs or GPUs in a desktop, server,
or mobile device with a single API. TensorFlow was originally
developed by researchers and engineers working on the Google Brain
team within Google's Machine Intelligence research organization for
the purposes of conducting machine learning and deep neural networks
research, but the system is general enough to be applicable in a wide
variety of other domains as well. In machine learning, a convolutional
neural network is a type of feed-forward artificial neural network in
which the connectivity pattern between its neurons is inspired by the
organization of the animal visual cortex. Individual cortical neurons
respond to stimuli in a restricted region of space known as the receptive
field. The receptive fields of different neurons partially overlap such that
they tile the visual field. The response of an individual neuron to stimuli
within its receptive field can be approximated mathematically by a
convolution operation. Convolutional networks were inspired by
biological processes and are variations of the multilayer perceptron,
designed to use minimal amounts of preprocessing. They have wide
applications in image and video recognition, recommender systems and
natural language processing. Convolutional neural networks (CNNs)
consist of multiple layers of receptive fields. These are small neuron
collections which process portions of the input image. The outputs of
these collections are then tiled so that their input regions overlap, to
obtain a higher-resolution representation of the original image; this is
repeated for every such layer. Tiling allows CNNs to tolerate translation
of the input image. Convolutional networks may include local or global
pooling layers, which combine the outputs of neuron clusters. They also
consist of various combinations of convolutional and fully connected
layers, with a pointwise nonlinearity applied at the end of or after each
layer. A convolution operation on small regions of the input is introduced
to reduce the number of free parameters and improve generalization.
One major advantage of convolutional networks is the use of shared weights
in convolutional layers, which means that the same filter (weights bank)
is used for each pixel in the layer; this both reduces memory footprint
and improves performance. Rectified Linear Units
(ReLU) are used as a substitute for saturating nonlinearities. This
activation function adaptively learns the parameters of rectifiers and
improves accuracy at negligible extra computational cost. In the context
of artificial neural networks, the rectifier is an activation function defined
as:
f(x) = max(0, x),
where x is the input to a neuron. This is also known as a ramp function
and is analogous to half-wave rectification in electrical engineering. This
activation function was first introduced to a dynamical network by
Hahnloser et al. in a 2000 paper in Nature, with strong biological motivations
and mathematical justifications. It has been used in convolutional
networks more effectively than the widely used logistic sigmoid (which is
inspired by probability theory; see logistic regression) and its more
practical counterpart, the hyperbolic tangent. The rectifier is, as of 2015,
the most popular activation function for deep neural networks. Deep
CNN with ReLUs trains several times faster. This method is applied to
the output of every convolutional and fully connected layer. Although
input normalization is not strictly required with ReLUs, local response
normalization is applied after the ReLU nonlinearity of the first and
second convolutional layers, because it reduces the top-1 and top-5
error rates. In a CNN, neurons within a hidden
layer are segmented into “feature maps.” The neurons within a feature
map share the same weight and bias. The neurons within the feature
map search for the same feature. These neurons are unique since they
are connected to different neurons in the lower layer. So for the first
hidden layer, neurons within a feature map will be connected to different
regions of the input image. The hidden layer is segmented into feature
maps where each neuron in a feature map looks for the same feature
but at different positions of the input image. Basically, the feature map is
the result of applying convolution across an image. The convolutional
layer is the core building block of a CNN. The layer's parameters consist
of a set of learnable filters (or kernels), which have a small receptive
field, but extend through the full depth of the input volume. During the
forward pass, each filter is convolved across the width and height of the
input volume, computing the dot product between the entries of the filter
and the input and producing a 2-dimensional activation map of that
filter. As a result, the network learns filters that activate when it detects
some specific type of feature at some spatial position in the input.
Stacking the activation maps for all filters along the depth dimension
forms the full output volume of the convolution layer. Every entry in the
output volume can thus also be interpreted as an output of a neuron
that looks at a small region in the input and shares parameters with
neurons in the same activation map. When dealing with high-
dimensional inputs such as images, it is impractical to connect neurons
to all neurons in the previous volume because such network
architecture does not take the spatial structure of the data into account.
Convolutional networks exploit spatially local correlation by enforcing a
local connectivity pattern between neurons of adjacent layers: each
neuron is connected to only a small region of the input volume. The
extent of this connectivity is a hyperparameter called the receptive field
of the neuron. The connections are local in space (along width and
height), but always extend along the entire depth of the input volume.
Such architecture ensures that the learnt filters produce the strongest
response to a spatially local input pattern. Three hyperparameters
control the size of the output volume of the convolutional layer: the
depth, stride, and zero-padding.
1. Depth of the output volume controls the number of neurons in the
layer that connect to the same region of the input volume. All of these
neurons will learn to activate for different features in the input. For
example, if the first Convolutional Layer takes the raw image as input,
then different neurons along the depth dimension may activate in the
presence of various oriented edges, or blobs of color.
2. Stride controls how depth columns around the spatial dimensions
(width and height) are allocated. When the stride is 1, a new depth
column of neurons is allocated to spatial positions only 1 spatial unit
apart. This leads to heavily overlapping receptive fields between the
columns, and also to large output volumes. Conversely, if higher strides
are used then the receptive fields will overlap less and the resulting
output volume will have smaller dimensions spatially.
3. Zero-padding controls the size of the border of zeros added around
the input volume. It is sometimes convenient to pad the input with
zeros on the border, and the size of this padding is a hyperparameter.
Zero-padding allows control over the spatial size of the output volume;
in particular, it is sometimes desirable to exactly preserve the spatial
size of the input volume.
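The convolution mechanics and the output-volume arithmetic described above can be sketched in pure Python (helper names are mine; real frameworks such as TensorFlow do this far more efficiently). The spatial output size follows the standard formula (W − F + 2P)/S + 1 for input width W, filter size F, padding P and stride S; the convolution itself is the sliding dot product followed by the ReLU:

```python
# Minimal sketch of a single-channel convolutional layer: the
# output-size arithmetic for depth/stride/zero-padding, the ReLU
# f(x) = max(0, x), and the sliding dot product. Note that, as in
# most deep learning frameworks, the kernel is not flipped, so this
# is technically cross-correlation, conventionally called convolution.

def conv_output_size(w: int, f: int, stride: int = 1, pad: int = 0) -> int:
    """Spatial output size: (W - F + 2P) / S + 1."""
    assert (w - f + 2 * pad) % stride == 0, "filter does not tile the input"
    return (w - f + 2 * pad) // stride + 1

def relu(x: float) -> float:
    """Rectified linear unit, the ramp function."""
    return max(0.0, x)

def conv2d(image, kernel, stride=1):
    """Valid convolution (no padding) of a 2-D image with a square
    kernel: take the dot product at each spatial position, then
    apply the ReLU nonlinearity to produce one activation map."""
    h, w = len(image), len(image[0])
    f = len(kernel)
    out_h = conv_output_size(h, f, stride)
    out_w = conv_output_size(w, f, stride)
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(f):
                for b in range(f):
                    acc += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(relu(acc))
        out.append(row)
    return out
```

For example, a 224x224 input convolved with a 3x3 filter at stride 1 and zero-padding 1 keeps its spatial size (`conv_output_size(224, 3, 1, 1)` gives 224), which is one common reason to use zero-padding; depth simply multiplies the number of such activation maps, one per learned filter.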