Plant Disease Detection Using Machine Learning
Anamika Jain, Anagha Langhe, Harsh Choudhary, Ashutosh Mishra
School of Computer Engineering and Technology
Dr. Vishwanath Karad MIT World Peace University
Pune, India
Abstract—In agriculture, the ability to identify plant diseases is extremely important, as these diseases can lead to crop loss and negatively impact food security. Detecting them early is crucial to prevent their spread and minimize damage. However, this task often requires a high amount of labor and experience. This project proposes using image processing techniques to extract features from images of plant leaves and then using machine learning algorithms to classify these leaves as either healthy or diseased. Machine learning algorithms, such as Convolutional Neural Networks, were evaluated using accuracy, precision, and recall. The proposed system displays a high level of accuracy (96.10%) in detecting and classifying plant diseases. It categorizes leaf images from several plant types into disease classes as well as healthy ones. These findings highlight the potential benefits of employing machine learning techniques in the detection of plant diseases, offering farmers a tool for managing and safeguarding their crops.
Keywords—Image processing, machine learning, classification, agriculture, crop management, feature extraction
I. INTRODUCTION
Agriculture plays a vital role in our society, as it provides us with resources like food, fiber, and fuel to support our growing population. However, plant diseases can pose challenges to both the economy and the environment by reducing food production. Traditional methods of identifying plant diseases rely on manual inspections, which are time-consuming, prone to errors, and not always effective in managing these diseases. To overcome these obstacles, a system that uses machine learning for disease detection in plants has emerged as a solution [9]. This system enables the identification of plant types affected by various diseases. Provided with this information, farmers can take measures before the diseases spread further. The aim of this research paper is to discuss the need for a machine learning-based system for detecting plant diseases and highlight its impact on the agricultural sector. The paper explores the components of this system and how they work together to enable real-time monitoring and early diagnosis of plant diseases. Furthermore, it examines the advantages of implementing such a system in agricultural practices, including increased crop yields, decreased reliance on pesticides, and reduced environmental impact. The discussion also covers the difficulties and constraints linked to putting such a system into practice, along with approaches to address them. By harnessing machine learning technology, this system shows promise in helping farmers enhance crop yields, mitigate the effects of diseases on the environment, and ensure a dependable supply of food for our growing population.
II. LITERATURE SURVEY
Convolutional Neural Networks (CNNs) were used by Jun Liu and Xuewei Wang [7] to detect potato leaf diseases. They proposed a method that used transfer learning to improve the accuracy of the model, which resulted in a classification accuracy of 99.1%.
Different machine-learning models for plant disease detection were compared by the authors of [1]. They
found that the Random Forest algorithm gave the best accuracy with a small dataset.
The authors of [2] used a Random Forest Classifier for plant disease detection and achieved an average accuracy of 93% along with a mean F1 score of 0.93. They employed statistical image processing and machine learning models in their system and stressed the need for more research on field-based plant disease detection and for standardized datasets to evaluate the performance of different image processing and machine learning techniques.
The authors of [3] proposed a model using DWT+PCA+GLCM+CNN for leaf disease detection and classification. They employed SVM, KNN, and CNN for feature classification and identified the need for larger and more diverse datasets and high-resolution images for training and testing the models.
In [5], the authors took a publicly available dataset comprising pictures of both unhealthy and healthy plant leaves and used deep convolutional neural networks for plant disease detection. They examined how different deep learning networks, training approaches, and dataset types affect the accuracy of the models. The authors found that the accuracy of the models dropped sharply when tested on photographs captured under conditions different from those used during the training phase, highlighting the need for more diverse training data to improve accuracy.
A comprehensive Android application incorporating TensorFlow Lite and a CNN was built by the authors of [6]. They used the Plant Village dataset. After preprocessing, the authors found that the training images taken under laboratory conditions differ significantly from the real-time images taken in a farmer's field. This decreases the overall accuracy of the models and calls for further diversification of the training datasets.
In [9], Shruti Aggarwal et al. discuss machine learning techniques to enhance rice production. The paper provides an overview and analysis of numerous papers published in the past eight years, covering various methodologies related to the identification of crop diseases, seedling health, and grain quality.
The authors of [10] use a color-aware two-branch DCNN for efficient plant disease classification, which outperforms the baseline model. Using the Plant Village dataset, the two-branch architecture achieves 98.2% accuracy and a multiclass F1 score of 0.981. However, the paper lacks comparisons with transfer learning methods and exploration of alternative color spaces.
In [11], A. Gesmundo presents a continual development methodology for unbounded intelligent systems in machine learning. The approach involves generating dynamic multitask models through sequential extensions and generalizations. When applied to image classification using the Plant Village dataset, the μ2Net+ model outperforms ResNet, DenseNet, and NASNet in accuracy, F1 score, and AUC-ROC metrics. The paper suggests potential future work to enhance system capabilities across multiple modalities.
An adaptive minimal ensemble method with EfficientNet CNNs is proposed by Bruno et al. in [12]. They augment the Plant Village dataset and achieve state-of-the-art accuracy of 99.1%. However, the work lacks comparisons with other methods and does not explore alternative CNN architectures or the impact of different data augmentation techniques.
Schwarz Schuler et al. in [13] use a lightweight DCNN for plant leaf disease classification based on a modified Inception V3 architecture with two branches for the L and AB channels. Tested on the Plant Village dataset, the proposed method achieves 99.06% accuracy, outperforming many state-of-the-art methods. While the paper lacks explicit discussion of research gaps, it excels in evaluating metrics like precision, recall, F1-score, and AUC-ROC.
III. OBJECTIVES
The proposed system focuses on improving agricultural productivity and sustainability through various methods. The objectives are:
1. Early Detection and Diagnosis of Plant Diseases: The main goal of the system is to detect and diagnose plant diseases early, allowing farmers to take necessary measures before the disease spreads and causes crop losses.
2. Reduction of Pesticide Use: The system aims to reduce the use of pesticides and other chemicals by helping farmers identify and manage plant diseases early, reducing the need for chemical treatments.
3. Improvement of Crop Yield and Quality: By detecting plant diseases early, the system aims to improve crop yield and quality, which leads to an increase in profits.
4. Sustainable Agriculture: By promoting sustainable agriculture practices, the system aims to contribute to environmental conservation and food security, ensuring that future generations can benefit from healthy farming practices.
5. Accessibility and Affordability: The system aims to be accessible to farmers, regardless of their location or financial status, providing them with disease diagnoses that are crucial for their livelihoods.
The system offers a promising solution for the management of plant diseases in agriculture, leading to more sustainable, environment-friendly farming practices.
2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS)
[...] 275 images of Cedar Apple Rust, and 630 images of Scab.
In the Potato dataset, we used 2152 images, of which 152 are in the Healthy category, 1000 are in the Early Blight category, and 1000 are in the Late Blight category.
Fig. 5. Potato Dataset
A. Data Preprocessing
In the preprocessing phase, several key steps were undertaken to prepare the images for the subsequent machine learning model training.
Resizing Images: Initially, the raw images within the dataset were uniformly resized to 256x256 pixels. This standardization ensures a consistent input size across all images, an important prerequisite for the effective training of machine learning models.
Normalization: After resizing, the pixel values were scaled to the range of 0 to 1. This normalization is a standard practice in data preprocessing that brings all pixel values to a comparable scale. Such uniformity facilitates the convergence of machine learning models during training.
B. Feature Engineering
Feature engineering is a crucial aspect of this project; in this context, it is implicit in the pre-processing and the design of the neural network architecture. The key considerations for feature engineering while developing this CNN model for plant disease detection are:
1. Image Pre-processing:
Each image was resized to 156x156 pixels and its pixel values were then scaled to a standard range of 0 to 1. This helped the model converge faster during training.
2. Data Augmentation:
Along with image pre-processing, data augmentation was applied to increase the diversity of the image data. Here, data augmentation made the model more robust to variations in orientation and scale. It also helped with class imbalance.
3. Model Architecture:
The CNN model was chosen because its multiple layers learn hierarchical representations of the features in the images, with different layers capturing different levels of abstraction. The pooling layers downsample the spatial dimensions and hence reduce the computational complexity. The fully connected layers, i.e., the dense layers at the end of the network, combine the high-level features for classification.
Additionally, the learning rate and the batch size were selected considering the speed of convergence of the model and the variability of the data, respectively.
C. Model Architecture for CNN
The dataset was divided into batches of 32 images using the TensorFlow Data Input Pipeline. The batches were then shuffled, cached, and prefetched for efficient training of the model.
[...] gives the model plot for the architecture used in this paper.
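As a rough illustration of the scaling and batching steps described above, the following pure-Python sketch (not the TensorFlow pipeline used in the paper) maps pixel values to the range 0 to 1, groups image indices into batches of 32, and shuffles the batch order. The 8-bit input range, the helper names, the fixed seed, and the choice to batch the whole Potato set of 2152 images without a train/validation split are assumptions made for illustration.

```python
import math
import random

BATCH_SIZE = 32        # batch size stated in the text
POTATO_IMAGES = 2152   # Potato dataset size stated in the text


def scale_pixels(pixels):
    """Map 8-bit intensities (assumed range 0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


def make_batches(items, batch_size):
    """Split items into consecutive batches; the last one may be smaller."""
    items = list(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def shuffle_batches(batches, seed=0):
    """Return a copy of the batch list in a reproducible random order."""
    shuffled = list(batches)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Integer indices stand in for image files (hypothetical placeholder data).
batches = shuffle_batches(make_batches(range(POTATO_IMAGES), BATCH_SIZE))
num_batches = len(batches)
batch_sizes = sorted(len(b) for b in batches)
scaled = scale_pixels([0, 128, 255])
```

Under these assumptions, 2152 images yield 68 batches: 67 full batches of 32 and one batch of 8.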
The model has six convolutional layers, each with a ReLU activation function and each followed by a MaxPooling layer. The output of the final MaxPooling layer is passed through a Flatten layer, which converts it into a one-dimensional vector, followed by a Dense layer with a ReLU activation function. The output layer is a Dense layer whose neuron count matches the number of classes in that plant's image data, with softmax as the activation. The model is compiled with the Adam optimizer, using Sparse Categorical Cross Entropy as the loss function and accuracy as the metric.
The model summary for the model used for the potato class is given in Fig. 6.
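To make the layer stack and loss described above more concrete, the sketch below traces how the spatial size of a 256x256 input would shrink through six convolution and pooling blocks, and evaluates softmax and sparse categorical cross-entropy for a single sample. The 3x3 unpadded convolutions, the 2x2 pooling windows, the example logits, and all function names are assumptions; the text does not state these settings, and this is not the authors' Keras code.

```python
import math


def conv_pool_trace(size, num_blocks, kernel=3, pool=2):
    """Side length of a square feature map after each Conv+MaxPool block.

    Assumes stride-1 convolutions without padding and non-overlapping
    pooling windows (assumed settings, not taken from the paper).
    """
    sizes = []
    for _ in range(num_blocks):
        size = size - kernel + 1   # unpadded convolution shrinks the map
        size = size // pool        # pooling keeps one value per window
        sizes.append(size)
    return sizes


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probs, label):
    """Loss for one sample whose true class is the integer index `label`."""
    return -math.log(probs[label])


trace = conv_pool_trace(256, num_blocks=6)   # [127, 62, 30, 14, 6, 2]
probs = softmax([2.0, 1.0, 0.1])             # hypothetical logits, 3 classes
loss = sparse_categorical_cross_entropy(probs, label=0)
```

With these assumed settings, the feature map entering the Flatten layer is 2x2 per channel. The "sparse" variant of the loss takes the true class as an integer index rather than a one-hot vector.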
C. Confusion Matrix:
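A confusion matrix counts, for each true class, how often each class was predicted; per-class precision, recall, and F1 score follow directly from its rows and columns. The sketch below is a minimal pure-Python illustration using made-up labels for three hypothetical classes. The labels and helper names are assumptions and do not reproduce the results of this paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_scores(matrix):
    """Return a (precision, recall, F1) tuple for every class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # column minus diagonal
        fn = sum(matrix[k]) - tp                       # row minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores.append((precision, recall, f1))
    return scores


# Made-up labels: 0 = Healthy, 1 = Early Blight, 2 = Late Blight.
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 1, 2, 2, 0]

matrix = confusion_matrix(y_true, y_pred, num_classes=3)
scores = per_class_scores(matrix)
accuracy = sum(matrix[k][k] for k in range(3)) / len(y_true)
```

F1 is the harmonic mean of precision and recall, so a class scores well only when both are high.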
E. Class-wise F1 Scores for All Plants:

TABLE II. APPLE F1 SCORES
Sr. No.   Class                   F1 Score
1         Healthy                 0.9565
2         Black Rot               0.9310
3         Cedar Apple Rust        0.9259
4         Scab                    0.9027

TABLE III. CORN F1 SCORES
Sr. No.   Class                   F1 Score
1         Healthy                 1.0000
2         Cercospora leaf spot    0.8392
3         Common Rust             1.0000
4         Northern Leaf Blight    0.9053

TABLE IV. GRAPE F1 SCORES
Sr. No.   Class                   F1 Score
1         Healthy                 0.9231
2         Black Rot               0.9641
3         Esca (Black Measles)    0.9800
4         Leaf Blight             0.9620

TABLE V. POTATO F1 SCORES
Sr. No.   Class                   F1 Score
1         Healthy                 0.9167
2         Early Blight            0.9975
3         Late Blight             0.9855

TABLE VI. PAPER ACCURACY VALUE COMPARISONS
Sr. No.   Paper                   Accuracy
1         [2]                     93%
2         [1]                     70%
3         [5]                     85.53%
4         Proposed Method         96.10%

Since the accuracy of our model was found to be higher than that of the models presented earlier, it can be a promising solution to the plant disease detection problem using the Plant Village dataset.
IX. FUTURE SCOPES
The accurate detection of plant diseases using machine learning models is a critical research area with several potential future developments. The first area of future research is collecting larger and more diverse datasets that include images of plants affected by different diseases at various stages of development, while ensuring they represent the regions where crops are grown and the environmental factors that affect the spread of diseases. Additionally, improved data annotation and pre-processing techniques can be explored to enhance the quality of the data used in training machine learning models.
[...] the development of techniques that can be easily used by farmers in the field, or the use of mobile applications that can capture images of plant parts and provide real-time diagnosis.
There is a need for a study into the integration of plant disease detection with precision agriculture approaches. Coupling machine learning algorithms with other technologies, such as drones or IoT sensors, can produce a comprehensive approach to crop monitoring and management, allowing farmers to detect diseases early and optimize resource utilization while enhancing crop yields. These prospective areas of study may aid in improving the accuracy and effectiveness of machine learning algorithms for plant disease identification.
REFERENCES
[1] Ramesh Maniyath, Shima & P V, Vinod & M, Niveditha & R, Pooja & N, Prasad & N, Shashank & Ram, Hebbar. (2018). Plant Disease Detection Using Machine Learning. 41-45. 10.1109/ICDI3C.2018.00017.
[2] Kulkarni, Pranesh & Karwande, Atharva & Kolhe, Tejas & Kamble, Soham & Joshi, Akshay & Wyawahare, Medha. (2021). Plant Disease Detection Using Image Processing and Machine Learning.
[3] Gavhale, Ms & Gawande, Ujwalla. (2014). An Overview of the Research on Plant Leaves Disease Detection Using Image Processing Techniques. IOSR Journal of Computer Engineering. 16. 10-16. 10.9790/0661-16151016.
[4] S. S. Harakannanavar, J. M. Rudagi, V. I. Puranikmath, A. Siddiqua, and R. Pramodhini, "Plant leaf disease detection using computer vision and machine learning algorithms," Global Transitions Proceedings, 2022, doi: 10.1016/j.gltp.2022.03.016.
[5] S. P. Mohanty, D. P. Hughes, and M. Salathé, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, 2016, doi: 10.3389/fpls.2016.01419.
[6] V. Suresh, M. Krishnan, M. Hemavarthini, K. Jayanthan, and D. Gopinath, "Plant Disease Detection using Image Processing," International Journal of Engineering Research & Technology (IJERT), vol. 09, no. 03, pp. 424-429, March 2020.
[7] D. Kothari, H. Mishra, V. Pandey, M. Gharat, and R. Thakur, "Potato Leaf Disease Detection using Deep Learning," International Journal of Engineering Research & Technology (IJERT), vol. 11, no. 11, November 2022.
[8] J. Liu and X. Wang, "Plant diseases and pests detection based on deep learning: a review," Plant Methods, vol. 17, no. 22, 2021, doi: 10.1186/s13007-021-00722-9.
[9] P. Gupta, S. Aggarwal, M. Suchithra, N. Chandramouli, M. Sarada, A. Verma, D. Vetrithangam, B. Pant, and B. Ambachew Adugna, "Rice Disease Detection Using Artificial Intelligence and Machine Learning Techniques to Improvise Agro-Business," Scientific Programming, vol. 2022, article ID 1757888, Hindawi, Jun. 2022, doi: 10.1155/2022/1757888.
[10] Schwarz Schuler, Joao Paulo & Romaní, Santiago & Abdel-Nasser, Mohamed & Rashwan, Hatem & Puig, Domenec. (2022). Color-Aware Two-Branch DCNN for Efficient Plant Disease Classification. Mendel. 28. 55-62. 10.13164/mendel.2022.1.055.
[11] Gesmundo, A. (2022). A Continual Development Methodology for Large-Scale Multitask Dynamic ML Systems. arXiv:2209.07326.
[12] Bruno A, Moroni D, Dainelli R, Rocchi L, Morelli S, Ferrari E, Toscano P and Martinelli M (2022) Improving plant disease classification by adaptive minimal ensembling. Front. Artif. Intell. 5:868926. doi: 10.3389/frai.2022.868926
[13] Schwarz Schuler, Joao Paulo & Romaní, Santiago & Abdel-Nasser, Mohamed & Rashwan, Hatem & Puig, Domenec. (2021). Reliable Deep Learning Plant Leaf Disease Classification Based on Light-Chroma Separated Branches. 10.3233/FAIA210157.