Report Batch-1
A PROJECT REPORT
Submitted by
MAHARAJA K (113120UG07058)
MEENAKSHI G (113120UG07060)
PRAKASH S (113120UG07072)
TARUNIKA K (113120UG07100)
IN
INFORMATION TECHNOLOGY
SIGNATURE SIGNATURE
CERTIFICATE FOR EVALUATION
S.NO  NAME OF THE STUDENTS WHO HAVE DONE THE PROJECT
1.    MAHARAJA K [113120UG07058]
2.    MEENAKSHI G [113120UG07060]
3.    PRAKASH S [113120UG07072]
4.    TARUNIKA K [113120UG07100]
TITLE OF THE PROJECT: AGRI DETECT: MACHINE LEARNING FOR TIMELY PLANT DISEASE IDENTIFICATION USING LEAVES
NAME OF THE PROJECT SUPERVISOR: Dr. V. SURESH KUMAR, Ph.D, PROFESSOR, Department of Information Technology
The report of this project work, submitted by the above students in partial
fulfilment for the award of the Degree of Bachelor of Technology in Information
Technology of Anna University, was evaluated and confirmed to be the report of
work done by the above students. This project report was submitted for the
viva-voce held on __________ at Vel Tech Multi Tech Dr. Rangarajan
Dr. Sakunthala Engineering College.
ACKNOWLEDGEMENT
We wish to express our sincere thanks to the Almighty and to the people who
extended their help during the course of our work. We are greatly and profoundly
thankful to our honorable Chairman, Col. Prof. Vel. Shri Dr.R.Rangarajan
B.E.(ELEC), B.E.(MECH), M.S.(AUTO)., D.Sc., & Vice Chairman,
Dr.Sakunthala Rangarajan M.B.B.S., for facilitating us with this opportunity.
We take this opportunity to extend our gratitude to our respected Chairperson
& Managing Trustee Smt. Rangarajan Mahalakshmi Kishore B.E., M.B.A.,
for her continuous encouragement.
Our special thanks to our cherished Vice-President Mr. K.V.D. Kishore
Kumar B.E., M.B.A., for his attention towards the student community. We also
record our sincere thanks to our honorable Principal, Dr. V. Rajamani M.E.,
Ph.D for his kind support to take up this project and complete it successfully.
We would like to express our special thanks to our Head of the Department,
Mr. R. Prabu, M.Tech., Department of Information Technology, our project
internal guide Dr. V. Suresh Kumar, Ph.D and our project coordinator Dr. M.
Rajesh Khanna, Ph.D for their moral support, for taking keen interest in our
project work, for guiding us all along until its completion, and for providing
all the information required to develop a good system and complete it
successfully.
ABSTRACT
Global agriculture is seriously threatened by plant diseases, which affect both
food security and economic stability. Real-time detection makes a quick response
possible, preventing the spread of disease and protecting agricultural
production. Achieving high accuracy with low latency is important but
challenging, particularly for computationally intensive models. Plant leaf image
datasets vary in size, yet existing studies typically analyse them with a single
algorithm, which may not be suitable for every dataset. This work develops an
automated method for the early detection of diseases affecting plant leaves. It
uses four distinct machine learning algorithms: Support Vector Machine (SVM),
K-Nearest Neighbor (KNN), Convolutional Neural Networks (CNN), and Decision
Tree. The system is implemented in MATLAB, chosen for its numerical computing
capabilities. It supports multiple analyses, so that the technique best suited
to the requirements and the kind of dataset can be used. The proposed system
analyses different plant leaf diseases, predicts the percentage of diseased
leaves with each of the four algorithms, and presents the findings of all four.
As a result, we can select the algorithm best suited to identifying diseases in
plant leaves at an early stage. By analyzing crop images, it is possible to
identify even small signs of disease, allowing for timely intervention to halt
the disease's course. If early detection lessens the need for such
interventions, sustainable agriculture may develop in harmony with the
environment.
LIST OF FIGURES
LIST OF ABBREVIATIONS
ML Machine Learning
SVM Support Vector Machine
CNN Convolutional Neural Networks
KNN K-Nearest Neighbor
PLDD Plant Leaf Disease Detection
RGB Red Green Blue
SGDM Stochastic Gradient Descent with Momentum
GMM Gaussian Mixture Model
HOG Histogram of Oriented Gradients
LBP Local Binary Pattern
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 DEFINITION
1.2 OBJECTIVE
2 LITERATURE SURVEY
2.1 PAPER 1
2.1.1 ADVANTAGES
2.1.2 DISADVANTAGES
2.2 PAPER 2
2.2.1 ADVANTAGES
2.2.2 DISADVANTAGES
2.3 PAPER 3
2.3.1 ADVANTAGES
2.3.2 DISADVANTAGES
2.4 PAPER 4
2.4.1 ADVANTAGES
2.4.2 DISADVANTAGES
2.5 PAPER 5
2.5.1 ADVANTAGES
2.5.2 DISADVANTAGES
2.6 PAPER 6
2.6.1 ADVANTAGES
2.6.2 DISADVANTAGES
2.7 PAPER 7
2.7.1 ADVANTAGES
2.7.2 DISADVANTAGES
2.8 PAPER 8
2.8.1 ADVANTAGES
2.8.2 DISADVANTAGES
2.9 PAPER 9
2.9.1 ADVANTAGES
2.9.2 DISADVANTAGES
2.10 PAPER 10
2.10.1 ADVANTAGES
2.10.2 DISADVANTAGES
3 SYSTEM DESIGN
3.1 SYSTEM REQUIREMENTS
3.1.1 HARDWARE CONFIGURATIONS
3.1.2 SOFTWARE CONFIGURATIONS
3.2 EXISTING SYSTEM
3.2.1 DISADVANTAGE OF EXISTING SYSTEM
3.3 PROPOSED SYSTEM
3.3.1 ADVANTAGE OF PROPOSED SYSTEM
3.4 PLDD PROGRESSION PLAN
3.5 UML DIAGRAMS
3.5.1 USE CASE DIAGRAM
3.5.2 CLASS DIAGRAM
3.5.3 SEQUENCE DIAGRAM
3.5.4 ACTIVITY DIAGRAM
4 MODULES DESCRIPTION
4.1 OVERVIEW OF THE PROJECT
4.2 MODULES
4.2.1 DATA COLLECTION
4.2.2 IMAGE PREPROCESSING
4.2.3 IMAGE SEGMENTATION
4.2.4 FEATURE EXTRACTION
4.2.5 MODEL TRAINING
4.2.6 MODEL EVALUATION
4.2.7 MODEL FINE-TUNING
4.3 ALGORITHMS
4.3.1 SUPPORT VECTOR MACHINE
4.3.2 CONVOLUTIONAL NEURAL NETWORK
4.3.3 K-NEAREST NEIGHBOR
4.3.4 DECISION TREE
4.4 ANALYSIS OF DATASETS
4.4.1 APPLE DATASET
4.4.2 GRAPES DATASET
4.4.3 MANGO DATASET
4.4.4 POTATO DATASET
4.4.5 TOMATO DATASET
5 TESTING
5.1 TESTING OF THE APPLICATION
APPENDICES
APPENDIX 1
APPENDIX 2
REFERENCES
CHAPTER 1
INTRODUCTION
1.1 DEFINITION
Agriculture is the backbone of human civilization, giving people all over
the world a means of subsistence and a living. However, plant diseases, which are
becoming more common and endangering the security of the world's food supply,
pose a serious threat. Concerns about providing for the nutritional requirements
of a growing population have been accentuated by the significant agricultural
losses brought on by the outbreak of these diseases. Among the most essential
goals of research on diagnosing diseases is to achieve high accuracy and low
latency; nevertheless, this is not an easy task, particularly when dealing with
computationally demanding models. We set out on an endeavor to investigate the
revolutionary potential of Machine Learning (ML) algorithms in transforming our
understanding, diagnosis, and management of plant diseases in the modern
environment of data-driven decision-making, where computational power and
creativity converge. In this regard, the use of ML algorithms presents a viable path
for disease detection and management in terms of accuracy, speed, and scalability.
We want to rethink the paradigms of disease identification, mitigation, and
prevention by utilizing the enormous libraries of agricultural data and
sophisticated algorithms. Automated systems with machine learning capabilities
offer significant potential for early diagnosis, prompt response, and tailored
interventions. This could lessen the detrimental impact of diseases on crop
yield and preserve the resilience of agricultural ecosystems.
1.2 OBJECTIVE
The primary objective is to develop a unique approach for automating the
early identification of plant leaf diseases. To achieve this goal, the project
uses four different algorithms: Decision Tree, K-Nearest Neighbor (KNN),
Support Vector Machine (SVM), and Convolutional Neural Networks (CNN). By
applying several algorithms and comparing their results, the research seeks to
maximize disease detection accuracy. This strategy draws on the range of
machine learning techniques that are available.
CHAPTER 2
LITERATURE SURVEY
2.1 Plant Leaf Disease Detection Using Image Processing
Rahul Kundu, Usha Chauhan, S.P.S Chauhan, 2022
2.1.1 ADVANTAGES
Offers a cutting-edge technique for identifying plant diseases using
image processing, advancing the field.
Utilizes sophisticated algorithms like Alex-Net and CNNs, enhancing
accuracy.
Offers practical benefits for agriculture, enabling early disease
detection and intervention.
2.1.2 DISADVANTAGES
Lacks comprehensive validation, potentially limiting generalizability.
Relies heavily on the availability and quality of training data.
2.2.1 ADVANTAGES
2.2.2 DISADVANTAGES
Lacks original research findings and experimental validation.
Among the most vital duties in agriculture is the detection of plant diseases,
which has a major effect on the economy. Given how prevalent plant diseases
are, finding infections in plants is an important part of working in the
agriculture sector. Identifying diseases in leaves requires the plants to be
monitored continuously, and this ongoing inspection demands a great deal of
human labor and time. A systematic, automated approach to examining the plants
is therefore needed. Automated techniques make plant diseases easier to
identify and save time and effort when finding damaged leaves. In contrast with
existing methods, the suggested algorithm can more reliably detect and classify
infected plants.
2.3.1 ADVANTAGES
Introduces novel leaf disease detection using machine learning,
potentially improving efficiency.
Automates disease detection, reducing manual effort and time.
Machine learning adapts to different species and diseases, enhancing
versatility.
2.3.2 DISADVANTAGES
The authenticity and quality of the training data have an important effect on
how well the machine learning model performs.
2.4.1 ADVANTAGES
• Offers an unconventional method for diagnosing and monitoring plant
diseases using deep learning techniques, potentially offering superior
accuracy and efficiency compared to conventional methods.
• Deep learning models may reduce the need for manual inspection and
intervention by automating the detection and diagnostic process, thus
saving time and labor costs.
• Large datasets may be used to develop deep learning models, allowing for
scalability across different plant species and potentially improving
generalization capabilities.
2.4.2 DISADVANTAGES
• The quantity and quality of labeled training data, which may be limited
or biased, affect the model's performance and generalizability.
diseases. Plant disease detection and classification has been addressed using a
variety of deep learning techniques over time. Transformer networks have
recently shown a lot of potential in computer vision problems. To detect plant
diseases, this study contrasts these methods with conventional CNN methods.
2.5.1 ADVANTAGES
2.5.2 DISADVANTAGES
• The quality and the diversity of the initial training data significantly
influence the performance of deep learning models. Improper or
unbalanced datasets might lead to inaccurate conclusions and
inadequate findings.
2.6 Plant Disease Detection Using Machine Learning Techniques
D. Varshney, B. Babukhanwala, J. Khan, D. Saxena, 2022
The article applies machine learning methodologies to plant disease
identification. It emphasizes the inefficiencies of traditional methods, along
with the potential for enhanced accuracy and efficiency offered by machine
learning algorithms. In agriculture, early disease identification is crucial,
and machine learning can help with this problem. The article may also briefly
mention the specific techniques and datasets used in the study, as well as the
expected outcomes or contributions to the field.
2.6.1 ADVANTAGES
2.6.2 DISADVANTAGES
2.7 Plant Disease Detection Using CNN
Garima Shrestha, Deepsikha, Majolica Das, Naiwrita Dey, 2022
2.7.1 ADVANTAGES
• Introduces the use of Convolutional Neural Networks (CNN) for plant
disease detection, indicating a sophisticated and state-of-the-art approach
to solving the problem.
2.7.2 DISADVANTAGES
2.8 An Efficient Algorithm for Plant Disease Detection Using Deep
Convolutional Networks
Pratibha Nayar, Shivank Chhibber, Ashwani Kumar Dubey,
2022
2.8.1 ADVANTAGES
2.8.2 DISADVANTAGES
• The availability and quality of labelled training data are critical
factors that determine the effectiveness of deep learning systems.
2.9.1 ADVANTAGES
• The review may help identify gaps in current research and propose
prospective topics for future investigation.
2.9.2 DISADVANTAGES
• Depending on the scope and focus of the review, it may overlook certain
emerging trends or methodologies in plant disease detection.
2.10.1 ADVANTAGES
2.10.2 DISADVANTAGES
CHAPTER 3
SYSTEM DESIGN
focused on establishing the capacity to analyze data directly from living plants.
These systems may more effectively handle the dynamic nature of plant leaf
analysis in real-world circumstances by broadening the scope of machine learning
approaches and including live data processing functions. This method improves
plant leaf analysis's precision and effectiveness while creating new opportunities
for use in the domains of ecology, agriculture, and environmental monitoring.
3.2.1. DISADVANTAGES
the need for chemical treatments, supporting sustainable agriculture and
environmental preservation. Furthermore, the system provides farmers with
valuable decision-support resources that enable them to adopt environmentally
and economically beneficial practices. Therefore, the suggested approach not
only transforms how agriculture manages disease but also lays the groundwork
for a more robust and sustainable food production system that can help ensure
food security for coming generations.
3.3.1 ADVANTAGES
3.4 PLDD PROGRESSION PLAN
3.5 UML DIAGRAMS
3.5.1 USE CASE DIAGRAM
Classes like imageDatastore, imread, imhist and predict are directly
involved in data processing, image reading, and model
training/prediction.
Classes Data Collection, Image Preprocessing, Image Segmentation,
Feature Extraction, Model Training, Model Evaluation, and Model
Fine Tuning represent the sequential steps in the process flow.
Arrows indicate the flow of data or process from one step to another
or the use of specific functions/classes within each step.
The sequence diagram starts with the "User" participant and depicts
interactions with other components or modules in the system.
Each participant represents a module or a logical unit of functionality.
Arrows indicate the flow of control or communication between
participants.
Loops represent iterative processes within the system.
The sequence diagram provides a clear overview of the sequence of
actions and interactions between components.
The activity diagram starts with the "start" node and ends with the
"stop" node.
Activities such as loading leaf images, creating labels, combining data,
initializing arrays, and processing images are represented as individual
actions in the diagram.
The "while" loop represents the iterative process of computing color
histograms for each image in the "allImages" array.
Each activity is represented by a rectangular box, and the arrows
indicate the flow of activities.
The activity diagram provides a high-level overview of the activities
performed and their sequence.
CHAPTER 4
MODULES DESCRIPTION
4.1 OVERVIEW OF THE PROJECT
The Support Vector Machine (SVM), Convolutional Neural Networks
(CNN), K-Nearest Neighbor (KNN), and Decision Tree algorithms are used
to address the difficulty of inconsistent plant leaf image datasets. The system
is implemented in MATLAB, and the study reports the effectiveness of each
algorithm. SVM and CNN achieved 98% accuracy, with KNN and Decision Tree
close behind at 96%. This algorithmic diversity provides an adaptable system
for early disease identification in plant leaves, supporting sustainable
agriculture and timely intervention.
4.2 MODULES
Data Collection
Image Pre-processing
Image segmentation
Feature extraction
Model Training
Model Evaluation
Model Fine-Tuning
healthy and diseased subfolders. The diseased plants are further categorised by
specific disease name, as illustrated.
At the outset, all images were resized to a standard resolution of
256x256 pixels. This resizing step serves two purposes. First, it promotes
consistency by standardizing the size of every image in the collection, so that
the input data fed into the machine learning algorithms has a uniform format
and size, minimizing variations that could affect model performance. Second,
resizing avoids the computational difficulties of working with images of
varying sizes, streamlining the analysis and improving computational
efficiency. Following resizing, normalization was applied to the dataset.
Normalization adjusts pixel values to a standardized scale, typically ranging
from 0 to 1 or -1 to 1. This step ensures that pixel values across different
images are comparable and consistent. Normalization also helps to mitigate the
effects of variations in lighting conditions, exposure settings, and camera
characteristics that may be present in the raw images. By standardizing pixel
values, normalization makes the dataset more robust to the variations
encountered in real-world scenarios.
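The normalization step described above can be sketched as follows. This is a minimal illustration written in Python rather than the MATLAB used for the project; the function names normalize_unit and normalize_symmetric are hypothetical, and 8-bit pixel values (0 to 255) are assumed.

```python
def normalize_unit(pixels):
    """Map 8-bit pixel values (0..255) to the range [0, 1]."""
    return [p / 255.0 for p in pixels]


def normalize_symmetric(pixels):
    """Map 8-bit pixel values (0..255) to the range [-1, 1]."""
    return [p / 127.5 - 1.0 for p in pixels]


# One row of illustrative pixel intensities (not taken from the project data).
row = [0, 64, 128, 255]
unit = normalize_unit(row)
symmetric = normalize_symmetric(row)
print(unit)
print(symmetric)
```

Either scale keeps the relative ordering of intensities; which range is preferable depends on the classifier that consumes the data.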
actionable insights to support sustainable agriculture practices and enhance crop
productivity.
4.2.4 FEATURE EXTRACTION
In order to differentiate between healthy and unhealthy leaves, plant leaf
images must be analyzed using a process called feature extraction, which entails
identifying important traits. Our method extracts characteristics from the leaves
according to their color, shape, and texture. Color histograms are utilized to
capture the distribution of pixel intensities, providing insights into the overall
color composition of the leaves. Shape characteristics, such as leaf morphology
and size, offer valuable information about structural differences between healthy
and diseased leaves.
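As an illustration of the color histogram feature, the sketch below builds one feature vector by concatenating per-channel histograms of the red, green, and blue values. It is written in Python for illustration rather than in the project's MATLAB; the function names, the bin count, and the sample pixels are assumptions, and a non-empty pixel list is assumed.

```python
def channel_histogram(values, bins=8):
    """Count 8-bit intensities (0..255) into equal-width bins."""
    counts = [0] * bins
    width = 256 / bins
    for v in values:
        counts[min(int(v / width), bins - 1)] += 1
    return counts


def color_histogram(pixels, bins=8):
    """Concatenate the R, G and B histograms of a list of (R, G, B) pixels.

    Each channel's block is divided by the pixel count so that it sums to 1,
    which makes images with different pixel counts comparable.
    """
    n = len(pixels)
    feature = []
    for channel in range(3):
        counts = channel_histogram([p[channel] for p in pixels], bins)
        feature.extend(count / n for count in counts)
    return feature


# Four illustrative pixels: three shades of green and one brown.
leaf = [(34, 139, 34), (50, 205, 50), (139, 69, 19), (0, 100, 0)]
vec = color_histogram(leaf, bins=4)
print(vec)
```

The resulting vector has 3 x bins entries and can be passed to any of the classifiers discussed in this chapter.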
datasets. The dataset is divided into 80% for training and 20% for testing, so
that the model is trained on a sufficiently large and diverse set of examples
while its performance can still be evaluated rigorously on unseen data. By
combining multiple machine learning techniques, we capitalize on the strengths
of each approach, resulting in a more robust and accurate disease detection
system. This approach holds promise for improving agricultural practices by
enabling early and precise identification of guava plant leaf diseases,
ultimately leading to more effective disease management and crop protection
strategies.
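The 80/20 division can be sketched as below. This is a Python illustration, not the project's MATLAB code; the helper name train_test_split, the fixed seed, and the file names are hypothetical. The split shown is a plain random split and does not balance the classes between the two parts.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# 100 illustrative (file name, label) pairs.
samples = [
    ("leaf_%03d.jpg" % i, "healthy" if i % 2 == 0 else "diseased")
    for i in range(100)
]
train, test = train_test_split(samples)
print(len(train), len(test))
```

Fixing the seed makes the split repeatable, which helps when the four algorithms are to be compared on exactly the same test images.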
The evaluation phase of the trained hybrid model, focusing on 80% of the
plant leaves in the dataset, marks a critical step in assessing its performance
and computational efficiency. The model's strong disease detection capabilities
are shown through careful examination of performance indicators including
accuracy and precision, establishing it as the best solution among the
combinations tested. Computational efficiency is also a key consideration,
especially in real-world deployment where timely decision-making is crucial.
The trained hybrid model shows its effectiveness in real-world applications by
improving computing efficiency without compromising performance indicators
such as accuracy and precision.
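For reference, the two performance indicators named above can be computed as in the following Python sketch (illustrative only; the project's evaluation was done in MATLAB). The label strings and the choice of "diseased" as the positive class are assumptions, and the sample labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive="diseased"):
    """Of the samples predicted positive, the fraction that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


y_true = ["diseased", "diseased", "healthy", "healthy", "diseased"]
y_pred = ["diseased", "healthy", "healthy", "diseased", "diseased"]
acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
print(acc, prec)
```

Accuracy alone can be misleading when healthy and diseased images are unevenly represented, which is why precision is reported alongside it.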
model's behavior and its capacity to distinguish between healthy and diseased
plant leaves. Based on the evaluation results, adjustments were made to the
model's hyperparameters to optimize its performance. Kernel sizes, which
determine the spatial extent of convolutional filters in convolutional neural
networks (CNNs), were fine-tuned to better capture relevant features in the input
images. By experimenting with different kernel sizes and architectures, we
aimed to improve the model's ability to extract meaningful patterns and
distinguish between different types of plant leaf diseases.
Throughout the fine-tuning process, careful attention was paid to
monitoring the model's performance on a validation dataset. This iterative
approach allowed us to systematically evaluate the impact of hyperparameter
adjustments on the model's accuracy and generalization ability. By iteratively
refining the model based on performance feedback, we aimed to achieve the
highest possible accuracy in identifying plant leaf diseases while minimizing the
risk of overfitting or underfitting.
This enhanced accuracy demonstrates the effectiveness of the fine-tuning
approach in refining the model's sensitivity and specificity in detecting plant leaf
diseases. By leveraging optimized hyperparameters and fine-tuned architectures,
the model is better equipped to support decision-making processes in agriculture,
facilitating prompt and focused responses to reduce crop losses and guarantee
food security.
4.3 ALGORITHMS
methods perform exceptionally well when trying to find the maximum-margin
separating hyperplane between the classes of the target feature.
4.3.2 CONVOLUTIONAL NEURAL NETWORKS (CNN)
Convolutional Neural Networks (CNNs) are deep learning algorithms that
perform very well in image recognition and processing tasks. A CNN is
composed of several layers: convolutional layers extract feature maps from the
image, pooling layers downsample the feature maps while preserving the most
important information, and one or more fully connected layers, which come
after the pooling layers, predict or classify the image.
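The layer types described above can be illustrated on a tiny example. The Python sketch below is not the network used in the project (which was built in MATLAB); it applies one hand-picked 2x2 filter, a ReLU activation, and 2x2 max pooling to a 5x5 grid of made-up values. All function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN convolution layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling (an odd trailing row or column is dropped)."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]  # simple diagonal-difference filter
conv = conv2d_valid(image, kernel)
features = max_pool2x2(relu(conv))
print(features)
```

In a trained CNN the filter weights are learned rather than fixed by hand, and many filters are applied in parallel.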
A CNN was implemented to classify leaf images as either healthy or
diseased. The script first specifies paths to the folders that hold images of
healthy, diseased, and mixed leaves. Next, it defines a custom function called
resizeImage to resize the images, and uses imageDatastore to load and resize
the images from the healthy and diseased directories. It then merges the
healthy and diseased datastores. The CNN architecture is defined next,
consisting of layers for image input, convolution, batch normalization, ReLU
activation, max pooling, and fully connected layers, followed by softmax and
classification layers. Training options are set to use stochastic gradient
descent with momentum ('sgdm'). The defined layers and the combined datastore
are used to train the CNN model. After training, the mixed images are loaded
and the trained CNN model is used to make predictions on them. Based on the
predicted labels, the script then determines the number and proportion of
diseased leaves in the mixed images.
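The final step, turning predicted labels into a disease percentage, can be sketched as follows. This is a Python illustration rather than the project's MATLAB script; the function name, the label strings, and the sample predictions are assumptions.

```python
def diseased_percentage(predicted_labels, diseased_label="Diseased"):
    """Percentage of predictions that carry the diseased label."""
    if not predicted_labels:
        return 0.0
    count = sum(1 for label in predicted_labels if label == diseased_label)
    return 100.0 * count / len(predicted_labels)


# Illustrative predictions for eight images from a mixed folder.
labels = ["Healthy", "Diseased", "Diseased", "Healthy",
          "Diseased", "Healthy", "Healthy", "Diseased"]
pct = diseased_percentage(labels)
print(pct)
```

The same calculation applies to the output of any of the four classifiers, since each produces one label per mixed image.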
4.3.3 K NEAREST NEIGHBOR (KNN)
The K-Nearest Neighbor (KNN) algorithm is a reliable and easy-to-use ML
technique for solving regression and classification problems. Using the notion
of similarity, KNN determines the label or value of a new data point from its
closest neighbors in the training dataset. It is widely applicable in real-life
settings because it is non-parametric, meaning it makes no underlying
assumptions about the distribution of the data (unlike algorithms such as GMM,
which assume a Gaussian distribution of the given data). We are given prior
labelled data (the training data), described by attributes, which allows new
points to be classified into groups.
The script loads images of both diseased and healthy leaves, combines them,
and extracts Histogram of Oriented Gradients (HOG) features. It trains a KNN
classifier on these features. Next, it evaluates the classifier on a set of
mixed leaf images and uses the predictions to determine the percentage of
unhealthy leaves in the mixed folder. It shows how the KNN algorithm and HOG
features can be used to categorize leaf images as healthy or unhealthy.
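The neighbor-voting idea can be sketched as below. This Python example is illustrative only and is not the project's MATLAB classifier: short two-dimensional vectors stand in for the much longer HOG descriptors, Euclidean distance is assumed, and the function name knn_predict and the value k = 3 are hypothetical choices.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors standing in for HOG descriptors.
train_features = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
                  (0.8, 0.9), (0.9, 0.8), (0.85, 0.95)]
train_labels = ["healthy", "healthy", "healthy",
                "diseased", "diseased", "diseased"]

first = knn_predict(train_features, train_labels, (0.2, 0.2))
second = knn_predict(train_features, train_labels, (0.7, 0.9))
print(first, second)
```

An odd k avoids tied votes in a two-class problem; the best value is normally chosen on validation data.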
4.3.4 DECISION TREE
The Decision Tree algorithm is a versatile and easy-to-use supervised
learning method that works for both regression and classification problems. It
works by recursively splitting the dataset into subsets based on the feature
that most effectively separates the data into homogeneous groups. These splits
produce a tree-like structure, where each internal node represents a feature,
each branch represents an outcome of the split, and each leaf node holds the
final prediction or decision.
The script uses a Decision Tree based on Local Binary Pattern (LBP) features
to categorize leaf images as healthy or unhealthy. It first establishes folder
paths for images of healthy, diseased, and mixed leaves. After that, it uses
imageDatastore to load images from these directories and converts them to
grayscale. To train the decision tree classifier, LBP features are extracted
from the grayscale images of healthy and diseased leaves and the corresponding
labels are created. The fitctree function is used to train the decision tree
after the features and labels have been combined. The trained decision tree is
then used to predict the labels for the LBP features extracted from the mixed
images. Based on the predictions, the script determines the proportion of
diseased leaves in the mixed images and displays the result.
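To show what an LBP feature encodes, the sketch below computes the basic 8-neighbour LBP code of a pixel and the histogram of codes over a grayscale patch. It is a Python illustration of the basic operator only and may differ from the LBP variant and defaults used by the MATLAB function in the project; the neighbour ordering, the function names, and the sample patch are assumptions.

```python
def lbp_code(gray, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at row r, column c.

    Each neighbour that is at least as bright as the centre sets one bit.
    Neighbours are visited clockwise, starting at the top-left.
    """
    center = gray[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if gray[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(gray):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, r, c)] += 1
    return hist


# A 3x3 grayscale patch with a single interior pixel (value 25).
patch = [
    [10, 20, 30],
    [40, 25, 15],
    [5, 60, 25],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
print(code)
```

The histogram, rather than the individual codes, serves as the texture feature vector on which a classifier such as a decision tree is trained.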
4.4 ANALYSIS OF DATASETS
[Figure: bar chart of apple leaf diseases; categories: CEDAR RUST, APPLE SCAB, BLACK ROT]
4.4.2 GRAPES DATASET
A total of 9027 photos were used from the collection; 2115 of these images
show grape leaves in good condition, and 6912 show leaves that are sick. We
have further categorised the photos of sick leaves into categories such as Black
Rot (2360 images), Black Measles (2400 images), and Leaf Blight (2152
images).
Using our system, we conducted the analysis utilising four distinct
algorithms, and the results are shown below.
[Figure: bar chart of grape leaf diseases; categories: BLACK_ROT, BLACK_MEASLES, LEAF_BLIGHT]
4.4.3 MANGO DATASET
We used a total of 3000 photos from the dataset, of which 500 showed
mango leaves in good health and the remaining 2500 showed leaves that were
sick. We have further categorised the photos of sick leaves into categories such
as 500 images of anthracnose, 500 images of bacterial canker, 500 images of die
back, 500 images of gall midge, and 500 images of powdery mildew.
Using our system, we conducted the analysis utilising four distinct
algorithms, and the results are shown below. The best method for classifying
mango leaf diseases was identified by comparing and analysing the output of
each algorithm. Our results offer insightful information about how well each
algorithm performs and whether it is appropriate for this purpose.
[Figure: MANGO LEAVES DISEASES; bar chart with series SVM, KNN, DT, CNN]
[Figure: POTATO LEAVES DISEASES; bar chart with categories EARLY_BLIGHT, LATE_BLIGHT]
[Figure: TOMATO LEAVES DISEASES; bar chart with series SVM, KNN, DT, CNN]
CHAPTER 5
TESTING
Figure 19: MATLAB App for Grape Leaves Disease Prediction
CHAPTER 6
CONCLUSION AND FUTURE ENHANCEMENTS
6.1 CONCLUSION
The integration of machine learning (ML) algorithms into modern agricultural
practices is expected to bring about a paradigm change in crop management by
enabling more proactive, data-driven approaches. Early detection of plant
diseases is necessary for managing crop infections and assuring the quality of
agricultural produce. ML models have significantly changed how disease is
understood in agriculture and have opened up new avenues for monitoring and
prevention. This study examines several techniques for classifying and
identifying plant diseases, based on an analysis of healthy and diseased
leaves.
We have demonstrated the disease percentages within various datasets. In
the Potato Leaves dataset, Early Blight and Late Blight together account for
86.81%, which precisely matches the values predicted by the SVM and CNN
algorithms. We therefore conclude that both SVM and CNN effectively predict
disease percentages for potato leaves. For Guava Leaves, Red Rust disease
accounts for 40.85%, a figure that coincides precisely with the CNN
prediction, so CNN is well suited to predicting disease percentages for guava
leaves. In the Apple Leaves dataset, Scab accounts for 47.39%, Black Rot for
47.06%, and Cedar Rust for 28.31%. We conclude that SVM is suitable for
predicting Scab and Black Rot leaves, while CNN is best for Cedar Rust leaves.
In the Mango Leaves dataset, diseased leaves constitute approximately 50% for
every type of disease. CNN can effectively predict Anthracnose, Die Back, Gall
Midge, and Powdery Mildew leaves, while SVM is also recommended for Powdery
Mildew leaves and Decision Tree for Bacterial Canker leaves. In the Grapes
Leaves dataset, Black Rot leaves contribute roughly 73.61%, Black Measles
around 76.58%, and Leaf Blight about 71.58%. From these findings, we conclude
that SVM can effectively predict all three diseases, while CNN can also
predict the Black Rot and Black Measles percentages. Lastly, in the Tomato
Leaves dataset, Bacterial Spot accounts for 57.21%, Early Blight 38.59%, Late
Blight 54.56%, Leaf Mold 37.45%, Spider Mites 51.32%, Target Spot 46.90%,
Mosaic Virus 19.01%, and Yellow Leaf Curl 77.10%. Consequently, SVM is deemed
suitable for predicting Mosaic Virus, Target Spot, Bacterial Spot, Leaf Mold,
and Early Blight; Decision Tree for Late Blight; and CNN for Bacterial Spot,
Leaf Mold, Spider Mites, Mosaic Virus, and Yellow Leaf Curl.
APPENDICES
APPENDIX 1
SVM:
% Load the leaf images from the Healthy and Diseased folders
healthyFolder = 'Healthy'; % Path to Healthy folder
diseasedFolder = 'Diseased'; % Path to Diseased folder
healthyImages = imageDatastore(healthyFolder);
diseasedImages = imageDatastore(diseasedFolder);
colorHistograms(i, :) = [redHist; greenHist; blueHist]';
end
% Load the mixed test images
mixedFolder = 'Mixed'; % Path to Mixed folder
mixedImages = imageDatastore(mixedFolder);
CNN:
% Define paths to the folders
healthy_folder = 'Healthy';
diseased_folder = 'Diseased';
mixed_folder = 'Mixed';
'Plots', 'training-progress');
% Perform prediction
predicted_labels = classify(net, mixed_images);
KNN:
% Load images
healthyImages = imageDatastore('Healthy', 'IncludeSubfolders', true,
'LabelSource', 'foldernames');
diseasedImages = imageDatastore('Diseased', 'IncludeSubfolders', true,
'LabelSource', 'foldernames');
mixedImages = imageDatastore('Mixed', 'IncludeSubfolders', true);
% Extract HOG features
trainingFeatures = [];
trainingLabels = [];
for i = 1:numel(allImages.Files)
img = readimage(allImages, i);
features = extractHOGFeatures(img);
trainingFeatures = [trainingFeatures; features];
trainingLabels = [trainingLabels; allImages.Labels(i)];
end
Decision Tree:
% Add your folder paths
healthyFolder = 'Healthy';
diseasedFolder = 'Diseased';
mixedFolder = 'Mixed';
% Create labels
healthyLabels = ones(size(healthyFeatures,1),1);
diseasedLabels = zeros(size(diseasedFeatures,1),1);
% Train the decision tree
tree = fitctree(features, labels);
APPENDIX 2
ACHIEVEMENTS
REFERENCES
pp. 56683-56698, 2021.
[9] P. Nayar, S. Chhibber and A. K. Dubey, "An Efficient Algorithm for
Plant Disease Detection Using Deep Convolutional Networks," 2022
14th International Conference on Computational Intelligence and
Communication Networks (CICN), Al-Khobar, Saudi Arabia, pp. 156-
160, 2022.
[10] G. K. Sandhu and R. Kaur, "Plant Disease Detection Techniques: A
Review," 2019 International Conference on Automation,
Computational and Technology Management (ICACTM), London, UK,
pp. 34-38, 2019.
[11] R. S. K. R, A. Singh, H. J. S V, A. D and J. S. Jayasree, "Plant Disease
Detection and Diagnosis using Deep Learning," 2022 International
Conference for Advancement in Technology (ICONAT), Goa, India, pp.
1-6, 2022.
[12] M. Ş. Soyer, C. Yılmaz, İ. M. Ozcan, F. Cogen and T. Ç. Yıldız,
"LeafLife: Deep Learning Based Plant Disease Detection
Application," 2021 13th International Conference on Electrical and
Electronics Engineering (ELECO), Bursa, Turkey, pp. 398-402, 2021.
[13] A. Suljović, S. Čakić, T. Popović and S. Šandi, "Detection of Plant
Diseases Using Leaf Images and Machine Learning," 2022 21st
International Symposium INFOTEH-JAHORINA (INFOTEH), East
Sarajevo, Bosnia and Herzegovina, pp. 1-4, 2022.
[14] S. S. Harakannanavar, J. M. Rudagi, V. I. Puranikmath, A. Siddiqua and
R. Pramodhini, "Plant leaf disease detection using computer vision and
machine learning algorithms," Global Transitions Proceedings, vol. 3,
no. 1, 2022, ISSN 2666-285X.