2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS)

Plant Disease Detection Using Machine Learning

979-8-3503-7222-9/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICETSIS61505.2024.10459349

Anamika Jain, Anagha Langhe, Harsh Choudhary, Ashutosh Mishra
School of Computer Engineering and Technology
Dr. Vishwanath Karad MIT World Peace University
Pune, India

Abstract—Identifying plant diseases is extremely important in agriculture, because these diseases can lead to crop loss and negatively impact food security. Detecting them early is crucial to prevent their spread and minimize damage. However, this task often requires a great deal of labor and experience. This project proposes using image processing techniques to extract characteristics from images of plant leaves and then using machine learning algorithms to classify these leaves as either healthy or diseased. Machine learning algorithms, such as Convolutional Neural Networks, were evaluated using accuracy, precision, and recall. The proposed system displays a high level of accuracy (96.10% overall) in detecting and classifying plant diseases. It categorizes images of several types of plants into disease classes as well as healthy ones. These findings highlight the potential benefits of employing machine learning techniques in the detection of plant diseases, offering farmers a tool for managing and safeguarding their crops.

Keywords—Image processing, machine learning, classification, agriculture, crop management, feature extraction

I. INTRODUCTION
Agriculture plays a vital role in our society as it provides us with resources like food, fiber, and fuel to support our growing population. However, plant diseases can pose challenges to both the economy and the environment by reducing food production. Traditional methods of identifying plant diseases rely on manual inspections, which are time-consuming, prone to errors, and not always effective in managing these diseases. To overcome these obstacles, a system that uses machine learning for disease detection in plants has emerged as a solution [9]. This system enables the identification of plant types affected by various diseases. With this information, farmers can take measures before the diseases spread further. The aim of this research paper is to discuss the need for a machine learning-based system for detecting plant diseases and highlight its impact on the sector. The paper explores the components of this system and how they work together to enable real-time monitoring and early diagnosis of plant diseases. Furthermore, it examines the advantages of implementing such a system in agricultural practices, including increased crop yields, decreased reliance on pesticides, and reduced environmental impact. The discussion also covers the difficulties and constraints linked to putting such a system into practice, along with approaches to address them. By harnessing machine learning technology, this system shows promise in helping farmers enhance crop yields, mitigate the effects of diseases on the environment, and ensure a dependable supply of food for our growing population.

II. LITERATURE SURVEY
Convolutional Neural Networks (CNN) were used by Jun Liu and Xuewei Wang [7] to detect potato leaf diseases. They proposed a method that utilized transfer learning to improve the accuracy of the model, which resulted in a classification accuracy of 99.1%.
Different machine-learning models for plant disease detection were compared by the authors of [1]. They


found that the Random Forest algorithm gave the best accuracy with a small dataset.
The authors of [2] used a Random Forest Classifier for plant disease detection and achieved an average accuracy of 93% along with a mean F1 score of 0.93. They employed statistical image processing and machine learning models in their system and stressed the need for more research on field-based plant disease detection and for standardized datasets for evaluating the performance of different image processing and machine learning techniques.
The authors of [3] proposed a model using DWT+PCA+GLCM+CNN for leaf disease detection and classification. They employed SVM, KNN, and CNN for feature classification and identified the need for larger and more diverse datasets and high-resolution images for training and testing the models.
In [5], the authors took a publicly available dataset comprising pictures of both unhealthy and healthy plant leaves and used deep convolutional neural networks for plant disease detection. They examined the effect of different deep learning networks, training approaches, and dataset types on the accuracy of the models. The authors found that the accuracy of the models dropped greatly when tested on photographs captured under conditions different from those of the training images, highlighting the need for more diverse training data to improve accuracy.
A comprehensive Android application incorporating TensorFlow Lite using a CNN was built by the authors of [6]. They used the Plant Village dataset. After preprocessing, the authors identified that the images taken for training under laboratory conditions differ significantly from the real-time images taken in a farmer's field. This decreases the overall accuracy of models and requires further diversification of the training datasets.
In [9], Shruti Aggarwal et al. discuss devices that use machine learning techniques to enhance rice production. The paper provides an overview and analysis of numerous papers published in the past eight years, covering various methodologies related to the identification of crop diseases, seedling health, and grain quality.
The authors of [10] use a color-aware two-branch DCNN for efficient plant disease classification, which outperforms the baseline model. Using the Plant Village dataset, the two-branch architecture achieves 98.2% accuracy and a multiclass F1 score of 0.981. However, the paper lacks comparisons with transfer learning methods and exploration of alternative color spaces.
In [11], A. Gesmundo presents a continual development methodology for unbounded intelligent systems in machine learning. The approach involves generating dynamic multitask models through sequential extensions and generalizations. When applied to image classification using the Plant Village dataset, the μ2Net+ model outperforms ResNet, DenseNet, and NASNet in accuracy, F1 score, and AUC-ROC metrics. The paper suggests potential future work to enhance system capabilities across multiple modalities.
An adaptive minimal ensemble method with EfficientNet CNNs is proposed by Bruno A. et al. in [12]. They augment the Plant Village dataset and achieve state-of-the-art accuracy at 99.1%. However, there is an absence of comparisons with other methods and a lack of exploration of alternative CNN architectures and of the impact of different data augmentation techniques.
Schwarz Schuler et al. in [13] use a lightweight DCNN for plant leaf disease classification using a modified Inception V3 architecture with two branches for the L and AB channels. After testing on the Plant Village dataset, the proposed method achieves 99.06% accuracy, outperforming many state-of-the-art methods. While the paper lacks explicit discussion of research gaps, it excels in evaluating metrics like precision, recall, F1-score, and AUC-ROC.

III. OBJECTIVES
The proposed system focuses on improving agricultural productivity and sustainability through various methods. The objectives are:

1. Early Detection and Diagnosis of Plant Diseases: The main goal of the system is to detect and diagnose plant diseases early, allowing farmers to take necessary measures before the disease spreads and causes crop losses.
2. Reduction of Pesticide Use: The system aims to reduce the use of pesticides and other chemicals by helping farmers identify and manage plant diseases early, reducing the need for chemical treatments.
3. Improvement of Crop Yield and Quality: By locating the presence of plant diseases early, the system aims to improve crop yield and quality, which leads to an increase in profits.
4. Sustainable Agriculture: By promoting sustainable agriculture practices, the system aims to contribute to environmental conservation and food security, ensuring that future generations can benefit from healthy farming practices.
5. Accessibility and Affordability: The system aims to be accessible to farmers, regardless of their location or financial status, providing them with


disease diagnoses that are crucial for their livelihoods.
The system offers a promising solution for the management of plant diseases in agriculture, leading to more sustainable, environment-friendly farming practices.

IV. SYSTEM ARCHITECTURE

Fig. 1. High-Level Diagram

The input for the system includes an image dataset of each plant. The dataset is preprocessed and then given to the machine learning model. The system has two major modules: the preprocessing module and the machine learning model that we have chosen, i.e., the Convolutional Neural Network (CNN).
The preprocessing module converts the image dataset into a TensorFlow dataset and then caches and shuffles that data. The data is then resized and rescaled, and a data augmentation algorithm is applied to it. After the preprocessing and augmentation phase, the dataset is prefetched for efficient usage.
The second module, i.e., the machine learning model (CNN), takes this image dataset as input and learns from it. The output of this system is the classification model for plant diseases. This system focuses on the multi-disease classification of multiple plants.

V. METHODOLOGY

For the classification of multiple diseases of multiple plants, we chose the open Plant Village dataset originally compiled by S.P. Mohanty, available on GitHub. The plants chosen for the classification study are:
● Apple
● Corn
● Grape
● Potato
All these plants contain the image data for multiple diseases. In the Apple dataset, we have used a total of 3171 images, out of which 1645 are of the Healthy category. We have included 621 images of Black Rot, 275 images of Cedar Apple Rust, and 630 images of Scab.

Fig. 2. Apple Dataset

In the Corn dataset, we have used 3852 images, out of which 1162 are of the Healthy category, 513 are of Cercospora leaf spot, 1192 are of Common Rust, and 985 are of Northern Leaf Blight.

Fig. 3. Corn Dataset

In the Grape dataset, we have used 4062 images, out of which 423 are of the Healthy category, 1180 are of Black Rot, 1383 are of Esca (Black Measles), and 1076 are of Leaf Blight.
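As a quick consistency check on the class counts quoted above for the Apple, Corn, and Grape subsets, the following sketch sums each subset and reports how unevenly the classes are represented. The counts are copied from the text; the helper names `class_shares` and `imbalance_ratio` are illustrative and not part of the paper's code.

```python
# Per-class image counts as stated in the text.
COUNTS = {
    "Apple": {"Healthy": 1645, "Black Rot": 621,
              "Cedar Apple Rust": 275, "Scab": 630},
    "Corn": {"Healthy": 1162, "Cercospora leaf spot": 513,
             "Common Rust": 1192, "Northern Leaf Blight": 985},
    "Grape": {"Healthy": 423, "Black Rot": 1180,
              "Esca (Black Measles)": 1383, "Leaf Blight": 1076},
}
# Subset totals as stated in the text.
STATED_TOTALS = {"Apple": 3171, "Corn": 3852, "Grape": 4062}


def class_shares(counts):
    """Return each class's fraction of the images in one plant subset."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (1.0 = balanced)."""
    return max(counts.values()) / min(counts.values())


for plant, counts in COUNTS.items():
    # The per-class counts should add up to the stated subset total.
    assert sum(counts.values()) == STATED_TOTALS[plant]
    shares = ", ".join(f"{name}: {share:.1%}"
                       for name, share in class_shares(counts).items())
    print(f"{plant} (imbalance {imbalance_ratio(counts):.2f}x): {shares}")
```

The stated totals match the per-class counts in all three subsets. Apple is the most skewed of the three, with roughly six Healthy images for every Cedar Apple Rust image, which is the kind of imbalance that motivates reporting F1 scores alongside accuracy.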


Fig. 4. Grape Dataset

In the Potato dataset, we have used 2152 images, out of which 152 are in the Healthy category, 1000 are in the Early Blight category, and 1000 are in the Late Blight category.

Fig. 5. Potato Dataset

A. Data Preprocessing
In the preprocessing phase of our dataset, several key steps were undertaken to optimize the images for the subsequent machine learning model training.
Resizing Images: Initially, the raw images within the dataset were uniformly resized to the dimensions of 256x256 pixels. This standardization ensures a consistent input size across all images, an important prerequisite for the effective training of machine learning models.
Normalization: After resizing, a normalization process was implemented to scale the pixel values within the range of 0 to 1. This normalization is a standard practice in data preprocessing, aiming to bring all pixel values to a comparable scale. Such uniformity facilitates the convergence of machine learning models during training.
Data Augmentation: A data augmentation technique was implemented after normalization to enrich the dataset and enhance the model's capacity for generalization. Leveraging the Keras API, various transformations were applied to the images. These included random horizontal and vertical flips, introducing variations in orientation and perspective. Additionally, a random rotation with a maximum angle of 0.2 radians was applied to each image, simulating the natural variability in image capture angles.
By using these methods, our dataset not only achieves consistency and compatibility with our model architecture but also exhibits enhanced robustness and variability. These qualities are pivotal for improving model performance and generalization capabilities.

B. Feature Engineering:
Feature engineering is a crucial aspect of this project; in this context, it is implicit in the pre-processing and the design of the neural network architecture. The key considerations for feature engineering while developing this CNN model for plant disease detection are:
1. Image Pre-processing:
Each image was resized to 256x256 pixels and then scaled to a standard range of 0 to 1. This helped the model converge faster during training.
2. Data Augmentation:
Along with image pre-processing, data augmentation was applied to increase the diversity of the image data. It made the model more robust to variations in orientation and scale, and it also helped in cases of class imbalance.
3. Model Architecture
The CNN model was chosen because its multiple layers learn hierarchical representations of the features in the images, with different layers capturing different levels of abstraction. The pooling layers downsample the spatial dimensions and hence reduce the complexity of computation. The fully connected layers, i.e., the dense layers at the end of the network, combine the high-level features for classification.
Additionally, the learning rate and the batch size were selected considering the speed of convergence of the model and the variability of the data, respectively.

C. Model Architecture for CNN
The dataset was divided into batches of 32 images using the TensorFlow Data Input Pipeline. The


batches were then shuffled, cached, and prefetched for efficient training of the model.
The model has 6 convolutional layers, each with a ReLU activation function and each followed by a MaxPooling layer. The output of the final MaxPooling layer is passed through a Flatten layer to reshape it into a one-dimensional vector, followed by a Dense layer with a ReLU activation function. The output layer is a Dense layer with a neuron count matching the number of classes for that plant's image data and a softmax activation. The model is compiled with the Adam optimizer, Sparse Categorical Cross Entropy as the loss function, and accuracy as the metric.
The model summary for the model used for the potato class is given in Fig. 6.

Fig. 6. Deep Learning Architecture

While training the model, it was observed that the accuracy reaches 99% at a certain epoch and then starts declining, which means that the model starts overfitting. Thus, the maximum number of epochs for training was set to 30, but we added a callback for Early Stopping that triggers when the model reaches an accuracy of 99%. Fig. 7 gives the model plot for the architecture used in this paper.

Fig. 7. Model Plot Diagram
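The pipeline and architecture described in this section might be sketched in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: the paper does not give filter counts, kernel sizes, pooling sizes, the dense-layer width, the shuffle buffer size, or the directory layout, so the values below (3x3 unpadded convolutions, 2x2 pooling, 32/64 filters, a 64-unit dense layer, one subfolder per class) are placeholders. Placing the rescaling and augmentation layers inside the model, and stopping through a custom threshold callback, are also assumptions about how the described behavior could be implemented.

```python
import math


def spatial_size(size, blocks, kernel=3, pool=2):
    """Feature-map side length after `blocks` stages of an unpadded
    kernel x kernel convolution followed by pool x pool max pooling."""
    for _ in range(blocks):
        size = (size - kernel + 1) // pool
    return size


def make_dataset(directory, image_size=256, batch_size=32):
    """Load images (assumed: one subfolder per class), then cache,
    shuffle, and prefetch the batches."""
    import tensorflow as tf  # deferred so spatial_size() runs without TensorFlow

    ds = tf.keras.utils.image_dataset_from_directory(
        directory, image_size=(image_size, image_size), batch_size=batch_size)
    return ds.cache().shuffle(1000).prefetch(tf.data.AUTOTUNE)


def build_model(num_classes, image_size=256):
    """Six Conv2D + MaxPooling2D blocks, Flatten, Dense(ReLU), Dense(softmax)."""
    import tensorflow as tf
    from tensorflow.keras import layers

    model = tf.keras.Sequential([
        layers.Input(shape=(image_size, image_size, 3)),
        layers.Rescaling(1.0 / 255),               # pixel values to [0, 1]
        layers.RandomFlip("horizontal_and_vertical"),
        # Keras expects the rotation factor as a fraction of a full turn,
        # so 0.2 radians is passed as 0.2 / (2 * pi).
        layers.RandomRotation(0.2 / (2 * math.pi)),
    ])
    for filters in (32, 64, 64, 64, 64, 64):       # assumed filter counts
        model.add(layers.Conv2D(filters, 3, activation="relu"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))  # assumed width
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                  metrics=["accuracy"])
    return model


def stop_at_accuracy(threshold=0.99):
    """Callback that ends training once training accuracy reaches `threshold`."""
    import tensorflow as tf

    class StopAtAccuracy(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            if (logs or {}).get("accuracy", 0.0) >= threshold:
                self.model.stop_training = True

    return StopAtAccuracy()


# With the assumed 3x3 / 2x2 blocks, print the final feature-map side length.
print(spatial_size(256, 6))

# Hypothetical usage (the path is a placeholder):
# model = build_model(num_classes=3)   # e.g. the three Potato classes
# model.fit(make_dataset("data/potato"), epochs=30,
#           callbacks=[stop_at_accuracy()])
```

Under these assumptions, six convolution and pooling stages reduce a 256x256 input to a 2x2 feature map before the Flatten layer. Note that Keras `RandomRotation` interprets its factor as a fraction of 2π, so passing 0.2 directly would allow rotations of up to ±72° rather than ±0.2 radians.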


VI. RESULTS AND DISCUSSIONS

A. Sample Observations:

Fig. 8. Sample Observations for Apple
Fig. 9. Sample Observations for Corn
Fig. 10. Sample Observations for Grape
Fig. 11. Sample Observations for Grape

B. Accuracy and Loss Curves:

Fig. 12. Accuracy-Loss Curves for Apple
Fig. 13. Accuracy-Loss Curves for Corn
Fig. 14. Accuracy-Loss Curves for Grape
Fig. 15. Accuracy-Loss Curves for Potato

C. Confusion Matrix:

Fig. 16. Confusion Matrix for Apple
Fig. 17. Confusion Matrix for Corn
Fig. 18. Confusion Matrix for Grape
Fig. 19. Confusion Matrix for Potato
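Class-wise F1 scores such as those reported in this section can be derived directly from a confusion matrix. As a minimal sketch, the snippet below computes per-class precision, recall, and F1 plus overall accuracy from a matrix with true classes on the rows and predicted classes on the columns. The 3x3 matrix and the function name `per_class_metrics` are made up for illustration; they are not the paper's data or code.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1, accuracy) for a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    precision, recall, and f1 are lists with one entry per class.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precision, recall, f1 = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, actually other
        fn = sum(cm[k]) - tp                        # actually k, predicted other
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return precision, recall, f1, accuracy


# Hypothetical 3-class confusion matrix (illustrative values only).
example_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
precision, recall, f1, accuracy = per_class_metrics(example_cm)
macro_f1 = sum(f1) / len(f1)   # unweighted mean over classes
print([round(v, 4) for v in f1], round(accuracy, 4), round(macro_f1, 4))
```

Macro-averaging gives every class the same weight regardless of its size, which is one reason F1 is informative on imbalanced data. How the paper aggregates its overall F1 score is not stated, so the macro average here is only one possible choice.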


D. Accuracy and F1 Scores for the models:

TABLE I. ACCURACY VALUES
Sr. No.  Plant   Train   Validation  Test
1        Apple   0.9945  0.9405      0.9381
2        Corn    0.9928  0.9479      0.9545
3        Grape   0.9944  0.9479      0.9641
4        Potato  0.9942  1.0000      0.9871

E. Class wise F1 Scores for all plants:

TABLE II. APPLE F1 SCORES
Sr. No.  Class                 F1 Score
1        Healthy               0.9565
2        Black Rot             0.9310
3        Cedar Apple Rust      0.9259
4        Scab                  0.9027

TABLE III. CORN F1 SCORES
Sr. No.  Class                 F1 Score
1        Healthy               1.0000
2        Cercospora leaf spot  0.8392
3        Common Rust           1.0000
4        Northern Leaf Blight  0.9053

TABLE IV. GRAPE F1 SCORES
Sr. No.  Class                 F1 Score
1        Healthy               0.9231
2        Black Rot             0.9641
3        Esca (Black Measles)  0.9800
4        Leaf Blight           0.9620

TABLE V. POTATO F1 SCORES
Sr. No.  Class                 F1 Score
1        Healthy               0.9167
2        Early Blight          0.9975
3        Late Blight           0.9855

VII. LIMITATIONS
The plant disease detection system has limitations that need to be considered for its successful implementation. The quality and diversity of the training dataset and environmental factors can affect the accuracy of the machine-learning model. Variability in disease symptoms can also limit the reliability of disease detection. Additionally, limited accessibility in remote areas and limited compatibility with various devices can hinder the system's adoption and usability. To overcome these limitations, knowledge sharing among stakeholders, including farmers and researchers, is crucial. By addressing these limitations, the system can effectively and efficiently promote disease management practices, leading to more sustainable agriculture and higher crop yields.

VIII. CONCLUSION
As we found that the dataset was imbalanced, the F-1 score for each plant category was computed. The overall F-1 score was found to be 0.9521. The overall accuracy observed for the model considering all plants was found to be 96.10%. This accuracy is higher than that of the previous works listed in Table VI.
Comparison with other papers:

TABLE VI. PAPER ACCURACY VALUE COMPARISONS
Sr. No.  Paper            Accuracy
1        [2]              93%
2        [1]              70%
3        [5]              85.53%
4        Proposed Method  96.10%

Since the accuracy of our model was found to be higher than that of the models presented earlier, this can be a promising solution to the plant disease detection problem using the Plant Village dataset.

IX. FUTURE SCOPES
The accurate detection of plant diseases using machine learning models is a critical research area with several potential future developments. The first area of future research is to focus on collecting larger and more diverse datasets that include images of plants affected by different diseases at various stages of development while ensuring they represent the regions where crops are grown and the environmental factors that affect the spread of diseases. Additionally, improved data annotation and pre-processing techniques can be explored to enhance the quality of the data employed in training machine learning models.

Another area for future research is the development of new machine-learning models that can handle complex and unstructured data. Cutting-edge models depend on images of plant parts and are not able to detect diseases that might affect the entire plant or may occur underground. There is scope for the development of models based on other non-invasive and efficient techniques, such as spectral or hyperspectral imaging, for the detection of diseases in plants.

The deployment of these machine learning models in real-world scenarios requires further development. Though the current models are efficient in laboratory conditions, their performance in the actual field is often affected by various factors, such as lighting conditions, plant growth stage, and other environmental factors. Future research should explore


the development of techniques that can be easily used by farmers in the field, or the use of mobile applications that can capture images of plant parts and provide real-time diagnosis.

There is a need for a study into the integration of plant disease detection with precision agricultural approaches. The coupling of machine learning algorithms for this purpose with other technology, such as drones or IoT sensors, can produce a comprehensive approach to crop monitoring and management, allowing farmers to detect diseases early and optimize resource utilization while enhancing crop yields. These prospective research areas may help improve the accuracy and effectiveness of machine learning algorithms in the context of plant disease identification.

REFERENCES
[1] Ramesh Maniyath, Shima & P V, Vinod & M, Niveditha & R, Pooja & N, Prasad & N, Shashank & Ram, Hebbar. (2018). Plant Disease Detection Using Machine Learning. 41-45. 10.1109/ICDI3C.2018.00017.
[2] Kulkarni, Pranesh & Karwande, Atharva & Kolhe, Tejas & Kamble, Soham & Joshi, Akshay & Wyawahare, Medha. (2021). Plant Disease Detection Using Image Processing and Machine Learning.
[3] Gavhale, Ms & Gawande, Ujwalla. (2014). An Overview of the Research on Plant Leaves Disease Detection Using Image Processing Techniques. IOSR Journal of Computer Engineering. 16. 10-16. 10.9790/0661-16151016.
[4] S. S. Harakannanavar, J. M. Rudagi, V. I. Puranikmath, A. Siddiqua, and R. Pramodhini, "Plant leaf disease detection using computer vision and machine learning algorithms," Global Transitions Proceedings, 2022, doi: 10.1016/j.gltp.2022.03.016.
[5] S. P. Mohanty, D. P. Hughes, and M. Salathé, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, 2016, doi: 10.3389/fpls.2016.01419.
[6] V. Suresh, M. Krishnan, M. Hemavarthini, K. Jayanthan, and D. Gopinath, "Plant Disease Detection using Image Processing," International Journal of Engineering Research & Technology (IJERT), vol. 09, no. 03, pp. 424-429, March 2020.
[7] Deep Kothari, Harsh Mishra, Vishal Pandey, Mihir Gharat, Rashmi Thakur, 2022, Potato Leaf Disease Detection using Deep Learning, International Journal of Engineering Research & Technology (IJERT), Volume 11, Issue 11 (November 2022).
[8] J. Liu and X. Wang, "Plant diseases and pests detection based on deep learning: a review," Plant Methods, vol. 17, no. 22, 2021, doi: 10.1186/s13007-021-00722-9.
[9] P. Gupta, S. Aggarwal, M. Suchithra, N. Chandramouli, M. Sarada, A. Verma, D. Vetrithangam, B. Pant, and B. Ambachew Adugna, "Rice Disease Detection Using Artificial Intelligence and Machine Learning Techniques to Improvise Agro-Business," in Scientific Programming, vol. 2022, article ID 1757888, Hindawi, Jun. 2022, doi: 10.1155/2022/1757888.
[10] Schwarz Schuler, Joao Paulo & Romaní, Santiago & Abdel-Nasser, Mohamed & Rashwan, Hatem & Puig, Domenec. (2022). Color-Aware Two-Branch DCNN for Efficient Plant Disease Classification. Mendel. 28. 55-62. 10.13164/mendel.2022.1.055.
[11] Gesmundo, A. (Year). A Continual Development Methodology for Large-Scale Multitask Dynamic ML Systems. arXiv:2209.07326.
[12] Bruno A, Moroni D, Dainelli R, Rocchi L, Morelli S, Ferrari E, Toscano P and Martinelli M (2022) Improving plant disease classification by adaptive minimal ensembling. Front. Artif. Intell. 5:868926. doi: 10.3389/frai.2022.868926
[13] Schwarz Schuler, Joao Paulo & Romaní, Santiago & Abdel-Nasser, Mohamed & Rashwan, Hatem & Puig, Domenec. (2021). Reliable Deep Learning Plant Leaf Disease Classification Based on Light-Chroma Separated Branches. 10.3233/FAIA210157.
