In partial fulfillment
of the requirements for the degree
Bachelor of Science in Computer Science
July 2024
APPROVAL SHEET
VERA PANAGUITON
Member
ACCEPTED and APPROVED in partial fulfillment of the requirements for the degree
Bachelor of Science in Computer Science.
ABSTRACT
Eggplant (Solanum melongena) is considered one of the leading and profitable vegetable
crops grown by farmers (Philippine Statistics Authority, 2023). It is also one of the most
popular fruit vegetables in the Philippines and is one of the important crops, but it is
susceptible to serious diseases that hinder its production (Martins et al., 2024). In 2020, the
annual eggplant production in Western Visayas was ranked fifth in the Philippines,
contributing to 7.05% of the total eggplant yield in the country. Spanning across 1,605
hectares of cultivated land, the region achieved a remarkable yield of 10,664 kilograms per hectare.
The dataset of eggplant leaves was manually annotated to meet the model’s training
requirements. The model was trained using a dataset of annotated eggplant leaf images,
categorized into healthy, Cercospora leaf spot, Verticillium Wilt, and Early blight. The best
performance was achieved for Verticillium wilt, with an F1_score of 0.94. For Early blight
and Cercospora leaf spot, the model’s performance was satisfactory but could be further
improved. Overall, the model was able to distinguish between
healthy and diseased leaves, but further optimization is necessary to enhance accuracy and
robustness, especially for Early blight and Cercospora leaf spot. This research contributes to
the development of advanced disease detection tools for eggplant farmers, enabling them to detect and manage leaf diseases more effectively.
ACKNOWLEDGEMENT
Words cannot express our deep gratitude to our adviser, Mr. Ryan Ercel Paderes, for
his invaluable patience, insightful feedback, and continuous encouragement throughout this
journey. His dedication and commitment to our success have been unwavering, and we are deeply grateful.
We are also immensely thankful to our defense committee, whose generosity in sharing their
knowledge and expertise was instrumental in the completion of this thesis. Their
constructive criticism and invaluable suggestions helped us refine our work and push it further.
A special note of thanks goes to our thesis writing II instructor, Mr. John Jowil Orquia, for
his untiring support and mentorship throughout the thesis writing process. His profound
knowledge, coupled with his patience and understanding, has been a cornerstone of our
success. His guidance not only helped shape this thesis but also encouraged us to strive for
excellence.
We extend our heartfelt thanks to Mr. Ricardo Catanghal, PhD, Dean of the College of
Computer Studies, for his comments and suggestions during the defense that helped us
understand our work further. We are also grateful to Dr. Sammy V. Militante, Dr. Renecynth
Juarigue, Mrs. Fema Rose B. Ecraela, Mrs. Rosarie G. Sanchez, and Ms. Vera Panaguiton
for their invaluable contributions as members of our thesis committee. Their expert
feedback and insightful suggestions played a crucial role in the development and refinement of this thesis.
We recognize that this accomplishment is not ours alone, but a culmination of the support,
guidance, and wisdom of our advisors, and committee members. Above all, our most
profound thanks go to Almighty God, whose divine guidance and blessings have been our
source of strength and inspiration throughout this journey. Without His grace, this work would not have been possible.
TABLE OF CONTENTS
TITLE PAGE..................................................................................................................
APPROVAL SHEET......................................................................................................ii
ABSTRACT....................................................................................................................iii
ACKNOWLEDGEMENT..............................................................................................iv
LIST OF FIGURES........................................................................................................viii
LIST OF TABLES..........................................................................................................ix
LIST OF FORMULAS...................................................................................................x
CHAPTER I INTRODUCTION
Objectives.......................................................................................................................3
Related Studies................................................................................................................19
Synthesis.........................................................................................................................25
Conceptual Framework...................................................................................................28
CHAPTER III METHODOLOGY
Research Method............................................................................................................29
Research Instrument........................................................................................................35
Ethical Consideration......................................................................................................41
Hyperparameter Optimization........................................................................................41
Conclusion....................................................................................................................47
Recommendation..........................................................................................................48
REFERENCES..................................................................................................................49
APPENDICES...................................................................................................................53
C. Validation result of Mask R-CNN at Learning Rate 0.01, 0.001, and 0.0001
D. Source Code..................................................................................................................61
F. Curriculum Vitae............................................................................................................64
LIST OF FIGURES
Figures
7 Conceptual Framework 27
9 Data Pre-Processing 32
LIST OF TABLES
Table
5 Confusion Matrix 35
LIST OF FORMULAS
Formula
1 Precision Formula 35
2 Sensitivity Formula 37
3 F1_score Formula 37
CHAPTER I
INTRODUCTION
Eggplant (Solanum melongena) is considered one of the leading and profitable vegetable crops grown by farmers (Philippine Statistics Authority, 2023). It is also one of
the most popular fruit vegetables in the Philippines and is one of the important crops, but
it is susceptible to serious diseases that hinder its production (Martins et al., 2024). In
2020, the annual eggplant production in Western Visayas was ranked fifth in the
Philippines, contributing to 7.05% of the total eggplant yield in the country. Spanning
across 1,605 hectares of cultivated land, the region achieved a remarkable yield of 10,664
kilograms per hectare, an increase over the previous year. From 2011 to 2016, production rose from
208,000 metric tons to 235,600 metric tons (Philippine Statistics Authority, 2023).
In an interview with Mr. Melicano on March 7, 2024, valuable insights were gathered regarding identifying and
managing eggplant diseases. He described eggplant fruits as ovoid, slender, or round, with spiny or nonspiny calyxes, and lengths varying from 2″ to
over 12″. Mr. Melicano highlighted key disease indicators, such as leaves showing
disease symptoms versus being damaged by animals like chickens. He identified the
primary causes of diseases as bacteria, viruses, fungi, and nematodes. Regarding worm
infestations, he noted that the impact on eggplant fruit depends on the extent of leaf
damage. For disease prevention, he recommended proper crop care practices and the
removal of pests such as worms and insects. Mr. Melicano's insights underscored the
importance of vigilant monitoring and proper management in maintaining crop health.
An automated eggplant disease diagnostic system could provide information for the
prevention and control of eggplant diseases. Numerous diseases that affect eggplant
might reduce its output and quality of produce (Kaniyassery et al., 2022). The development
and severity of these diseases are influenced by environmental
conditions (Singh et al., 2023). Farmers have traditionally relied on visual inspection to
identify diseases, a process that is time-consuming,
labor-intensive, and prone to human error. Additionally, early signs of disease can be
difficult to spot with the naked eye, leading to delayed treatment and reduced crop yields.
Recent advancements in computer vision and deep learning offer a more efficient
solution. Mask R-CNN, a type of image recognition technology, has shown great promise
in various object detection tasks, including plant disease identification. This technology
can analyze digital images of eggplant leaves, pinpointing areas that might be diseased.
In this study, the researchers will create a dataset of common healthy leaf and eggplant
leaf diseases such as Cercospora leaf spot, Verticillium wilt, and Early blight, and train a Mask R-CNN model on this dataset for classification and detection.
Statement of the Problem
Eggplant cultivation faces significant challenges due to various leaf diseases such as
Cercospora leaf spot, Verticillium wilt, and Early blight. Traditional methods of detecting
diseases in eggplant plants, like looking at the leaves and checking them manually, are
labor-intensive and can lead to mistakes because they depend on people's judgment and can be
affected by human error. This process takes a lot of time, requires trained individuals,
and is not easy to do on a large scale. Sometimes, different people may interpret the same leaf differently,
misidentifying a disease or missing early signs of infection that are hard to see.
In connection with that, the purpose of this study is to create a dataset of healthy leaf
and eggplant leaf diseases such as Cercospora Leaf Spot, Verticillium Wilt and Early
Blight, and to train Mask R-CNN on the dataset for classification and detection of healthy and diseased eggplant leaves.
Objectives
The main objective of the research is to apply Mask R-CNN for the
classification of eggplant leaf diseases. Specifically, this study aims to achieve the
following:
1. To create a new dataset of healthy eggplant leaves and leaves with diseases such as Cercospora leaf spot, Verticillium wilt, and Early blight.
2. To train Mask R-CNN to classify and detect healthy and diseased eggplant leaves.
3. To evaluate the Mask R-CNN performance in terms of precision, recall (sensitivity), and F1_score.
Significance of the Study
Eggplant Farmers: The result of this study will help the farmers recognize the diseases affecting their crops early.
Eggplant Vendors: Vendors in the eggplant supply chain will gain indirectly. Vendors
can have access to a more consistent and presumably higher-quality supply of eggplants
through enhanced disease management procedures. This can contribute to a stable and reliable supply chain.
Consumers: The result of this study helps consumers benefit indirectly, since they now
have access to better and potentially more abundant eggplant produce as a result of
improved disease management. Agricultural stakeholders can likewise use the
findings to improve their support and advice for eggplant farmers. The use of innovative detection tools can strengthen that support.
Future Researchers: They can expand on the results of this study to investigate new
strategies or enhancements in using MASK R-CNN for disease detection. This research
can help to further the development of creative solutions in imaging and agriculture.
Scope and Limitations
This study will focus on creating a dataset of eggplant leaves and on training the Mask
R-CNN to classify healthy and diseased eggplant leaves. The study will train the Mask R-
CNN to distinguish healthy leaves from leaves with diseases such as
Cercospora leaf spot, Verticillium wilt, and Early blight. The Mask R-
CNN will only classify diseases included in the dataset; new and unknown diseases may
not be recognized.
Definition of Terms
For a better understanding of this study, the following terms are defined in the context of
this research.
Accuracy- is a crucial metric for evaluating the performance of deep learning models. It
helps to determine the effectiveness of a model and its ability to make correct predictions. In
this study, accuracy is used as one of the measures to test the pre-trained model in classifying eggplant leaves.
Bacteria- are microscopic organisms, some of which are capable of causing diseases in plants. In this study, "bacteria" refers to the presence of pathogens such as those that cause bacterial wilt.
Bacterial Wilt- is a plant disease caused by bacterial pathogens, leading to wilting and death
of the plant. In this study, bacterial wilt refers to a disease in eggplant plants caused by bacterial pathogens.
Classification- is the operation of distributing objects into classes or groups which are, in
general, less numerous than them. In this study, classification involves defining product
attributes in a manner different from the typing method (Internet Encyclopedia of
Philosophy).
Early blight- caused by Alternaria species, is one of the major diseases in the production of
tomato (Solanum lycopersicum), potato (Solanum tuberosum) and other plants, and is most
prevalent on unfertilized or otherwise stressed plants. In this study, it entails identifying circular
lesions with concentric rings on the leaves, confirming the presence of the disease.
Eggplant leaves- Refer to the foliage of the eggplant (Solanum melongena), which is a
member of the nightshade family. In this study, eggplant leaves will be used as the target
object of detection and classification. The Mask R-CNN will be trained to identify these leaves and classify them as healthy or diseased.
Epochs- define the number of times that the learning algorithm will work through the entire
training dataset. In this study, the researchers will examine how the number of epochs affects the model's performance.
F1 Score- is a standard metric in various machine learning and data analysis fields. In this
study, it is the harmonic mean of precision and recall (Van Rijsbergen, 1979).
Fungi- are eukaryotic organisms, including molds, mildews, and mushrooms, capable of
causing plant diseases. This study deals with fungal pathogens such as Cercospora
melongenae and Verticillium dahliae, which damage eggplant leaves (Alexopoulos et al.,
1996).
Healthy Leaf- refers to a leaf free of visible disease symptoms or abnormalities. In this study, a healthy leaf meets predefined visual criteria such as consistent
green color, uniform texture, and absence of visible disease symptoms, as identified by image analysis.
Image Annotation - is the process of labeling or tagging images with metadata to describe
the objects or regions within the images. In this study, annotation marks and labels the
healthy and diseased leaf regions used to train the Mask R-CNN model.
Leaf Disease- is a pathological condition affecting the natural growth of a plant, one that not only
creates hurdles in agribusiness but is also responsible for hampering the agricultural
production of a country. In this study, "leaf disease" refers to any pathological condition that
affects the leaves of a plant, disrupting its natural growth process (Sarkar et al., 2023).
Leaf Spot- is a plant disease characterized by the presence of spots on the leaves caused by
fungal or bacterial pathogens. This study defines leaf spot as the obvious indications of
fungal or bacterial infections on eggplant leaves, which appear as separate spots or lesions
(Agrios, 2005).
Mask R-CNN- is a deep learning model for object detection, semantic segmentation, and instance segmentation. Mask R-CNN was developed on top of Faster R-CNN, a
Region-Based Convolutional Neural Network, and is utilized in this study to detect
and classify eggplant leaves within the images accurately (Yoo et al., 2022).
Nematodes- are microscopic, worm-like organisms, some of which are parasitic, attacking plant roots and causing diseases. This research deals with parasitic worms such as root-knot nematodes.
Overfitting- a modeling error in statistics that occurs when a function is too closely aligned
to a limited set of data points. In this study, the researchers will look at how overfitting
affects the accuracy and reliability of machine learning models (Twin, 2021).
Precision- measures the proportion of correctly annotated objects or regions compared to all objects or
regions annotated as positive (i.e., belonging to a specific class or category). In the context of
this study, precision is evaluated to assess the accuracy of the Mask R-CNN model in classifying eggplant leaf diseases.
Recall - measures the completeness of annotations. In image annotation, recall indicates the
proportion of correctly annotated objects or regions compared to all actual objects or regions
belonging to the class of interest in the image. In this study, recall is used to evaluate the
comprehensiveness of the Mask R-CNN model in detecting all instances of eggplant leaves in the images.
Utilizing - refers to the act of making practical or effective use of something, typically to
achieve a specific purpose or goal. In this study, utilizing refers to applying Mask R-CNN to classify and detect eggplant leaf diseases.
Viruses- are infectious agents consisting of genetic material (DNA or RNA) enclosed in a
protein coat, capable of causing diseases in plants. This study focuses on mosaic, a disease
caused by viruses that produces different browning or mottling patterns on eggplant leaves.
CHAPTER II
REVIEW OF RELATED LITERATURE
In this section, the researchers review literature and studies on utilizing computer
vision and deep learning for agricultural disease detection, with a particular focus on plant disease classification using Mask R-CNN.
Panigrahi et al. (2020) cited that plant diseases significantly reduce agricultural
productivity, posing challenges for farmers in detection and control. Early disease detection is
crucial to prevent further losses. Research has focused on using supervised machine learning
techniques for this purpose, particularly for maize plant disease detection through plant
images. Studies have analyzed and compared methods such as Naive Bayes (NB), Decision
Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest
(RF). Among these, the RF algorithm has shown the highest accuracy at 79.23%, making it the
most effective model for disease prediction. These trained models aim to assist farmers in the
early detection and classification of plant diseases, offering a preventive approach to managing
crop health.
Wani et al. (2021) also cited that plant disease detection is vital for productive agriculture
and a robust economy. Traditional methods for detecting plant diseases are labor-intensive,
time-consuming, and require specialized expertise. Recently, automatic plant disease detection
has emerged as a significant research area, attracting considerable attention from academics,
researchers, and practitioners. Machine Learning (ML) and Deep Learning (DL) techniques have been widely applied to automatically detect and classify such
diseases. The study focuses on diseases and infections affecting four crop types: Tomato,
Rice, Potato, and Apple. It begins with an examination of the various infections and diseases
associated with these crops, detailing their causes and symptoms. The review then delves into
the steps involved in plant disease detection and classification using ML and DL. The
availability of online datasets for plant disease detection is also discussed, alongside an evaluation of reported approaches.
These evaluations consider performance metrics, datasets used, and feature extraction
methods. Finally, the review identifies challenges in using ML and DL for plant disease detection and points to directions for future research.
Deep learning learns the image features and extracts contextual details and global features
that will help in reducing the error remarkably. A deep learning architecture, based on
convolutional neural network (CNN), was used to identify vegetables with minimal color
distortion. For classifying the quality level, the CNN selects the relevant region (Bhargava & Bansal,
2021).
Computer Vision
According to the study of Sabzi et al. (2020) a computer vision system has been proposed
for the automatic recognition and classification of five varieties of plant leaves under varying conditions: 1) Cydonia oblonga
(quince), 2) Eucalyptus camaldulensis dehn (river red gum), 3) Malus pumila (apple), 4)
Pistacia atlantica (Mt. Atlas mastic tree), and 5) Prunus armeniaca (apricot). A total of 516
images of tree leaves were captured, and 285 features were computed for each image. These
features included shape, color, texture based on the gray level co-occurrence matrix,
Seven discriminant features were selected for classification using three classifiers: a hybrid
artificial neural network–ant bee colony (ANN–ABC), a second hybrid artificial neural network variant, and linear discriminant analysis
(LDA). The mean correct classification rate (CCR) achieved was 94.04% for the hybrid ANN–ABC.
Additionally, the best classifier performance metrics for mean area under the curve (AUC),
mean sensitivity, and mean specificity for the five tree varieties were as follows:
2. Eucalyptus camaldulensis dehn (river red gum): AUC = 1.00 (LDA), Sensitivity = 100%
3. Malus pumila (apple): AUC = 0.996 (LDA), Sensitivity = 96.63% (LDA), Specificity =
94.99% (LDA)
4. Pistacia atlantica (Mt. Atlas mastic tree): AUC = 0.979 (LDA), Sensitivity = 91.71%
5. Prunus armeniaca (apricot): AUC = 0.994 (LDA), Sensitivity = 88.67% (LDA), Specificity
= 94.65% (LDA)
This study demonstrates the effectiveness of these classifiers in accurately identifying and
classifying various plant leaves, with LDA showing particularly strong performance for most varieties.
Mask R-CNN
Mask-RCNN is a deep learning algorithm for object detection and instance segmentation,
building upon Faster-RCNN by predicting both bounding boxes and precise segmentation
masks for each object. Its key innovations include the ROIAlign technique for accurate
spatial information and the addition of a mask head branch for fine-grained pixel-level
segmentation.
The Feature Pyramid Network (FPN) shown in Figure 1 plays a crucial role in Mask
R-CNN's feature extraction. FPN constructs a multi-scale feature pyramid that incorporates
information from different image scales. This allows the model to gain a more
comprehensive understanding of objects at different scales, which is
particularly important for Mask R-CNN when performing instance segmentation across a wide range of object
sizes.
Mask R-CNN was proposed by Kaiming He et al. in 2017. It is very similar to Faster R-
CNN except there is an additional branch to predict segmentation masks. The stage of region proposal
generation is the same in both architectures; the second stage, which works in parallel,
predicts the class, generates a bounding box, and outputs a binary mask for each RoI.
The architecture of Mask R-CNN is built upon the Faster R-CNN architecture, with the
addition of an extra "mask head" branch for pixel-wise segmentation. The overall architecture is illustrated in Figure 2.
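To make this two-stage design concrete, the minimal sketch below instantiates a Mask R-CNN with a ResNet-50 FPN backbone and resizes its heads for this study's four leaf classes plus background. The study itself ran TensorFlow on Google Colab (Chapter III); this sketch instead uses PyTorch's torchvision implementation, so the API calls, helper name, and class list are illustrative assumptions rather than the authors' actual code.

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Background + the four leaf categories used in this study (order is illustrative).
CLASSES = ["background", "healthy", "cercospora_leaf_spot",
           "verticillium_wilt", "early_blight"]

def build_model(num_classes: int = len(CLASSES)):
    # Mask R-CNN with a ResNet-50 + FPN backbone, pre-trained on COCO.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Swap the box classification head for one sized to our classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Swap the mask predictor head likewise.
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
    return model

Starting from COCO pre-trained weights mirrors the transfer-learning strategy discussed in the related studies reviewed later in this chapter.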
Backbone Network
Mask R-CNN leverages a pre-trained Convolutional Neural Network (CNN) like ResNet
or ResNeXt as its backbone. This powerful component extracts high-level features from the
input image, capturing essential characteristics of objects present. To address the challenge of
objects with varying sizes, a Feature Pyramid Network (FPN) is then built on top of the
backbone.
The FPN tackles the challenge of objects with varying sizes by constructing a multi-scale
feature pyramid. This pyramid cleverly combines features from different levels of the
backbone network. High-resolution features from the backbone provide rich semantic
features offer more precise spatial details, essential for accurately pinpointing object
Figure 3 demonstrates the architecture of a Feature Pyramid Network (FPN). This
network is designed to efficiently process images and detect objects of varying sizes by combining features from multiple levels:
1. Feature Extraction: The backbone network extracts high-level features from the input
image.
2. Feature Fusion: FPN creates connections between different levels of the backbone
network, merging higher-level semantic information with lower-level feature maps, allowing the model to reuse features at
different scales.
3. Feature Pyramid: The fusion process generates a multi-scale feature pyramid, where each
level of the pyramid corresponds to a different feature resolution. The bottom levels of the
pyramid contain the highest-resolution features, while the top levels contain the lowest-
resolution features.
The feature pyramid generated by FPN enables Mask R-CNN to handle objects of various
sizes effectively. This multi-scale representation allows the model to capture contextual
information and accurately detect objects at different scales within the image.
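As a hedged illustration of this multi-scale output, the snippet below builds a ResNet-50 FPN backbone with torchvision (an assumed library choice, not the thesis's exact stack) and prints the pyramid levels produced for a dummy image.

import torch
from torchvision.models import ResNet50_Weights
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet-50 backbone with an FPN on top, using ImageNet pre-trained weights.
backbone = resnet_fpn_backbone(backbone_name="resnet50",
                               weights=ResNet50_Weights.DEFAULT)

x = torch.rand(1, 3, 512, 512)   # one dummy 3-channel 512x512 image
features = backbone(x)           # ordered dict of multi-scale feature maps

for level, fmap in features.items():
    print(level, tuple(fmap.shape))
# Levels '0'..'3' plus 'pool': each has 256 channels, with spatial sizes
# shrinking from 128x128 down to 8x8 for this input.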
The Region Proposal Network (RPN) is a crucial component inherited from Faster R-
CNN. It analyzes the feature map produced by the backbone network and proposes potential
regions of interest (candidate bounding boxes) that might contain objects within the image.
Additionally, the RPN goes beyond just proposing boxes. It also predicts the probability that each proposal contains an object (its objectness score).
ROI Align
After the RPN generates region proposals, the ROIAlign (Region of Interest Align) layer
is introduced. This step helps to overcome the misalignment issue in ROI pooling.
ROIAlign plays a crucial role in accurately extracting features from the input feature map for
downstream detection and segmentation tasks. The primary purpose of ROIAlign is to align the features within a region of interest
(ROI) with the spatial grid of the output feature map. This alignment is crucial to prevent the
information loss that can occur when quantizing the ROI's spatial coordinates to the nearest integer.
Figure 5: ROI Align Operation
Figure 5 illustrates the Region of Interest (RoI) Align operation, a crucial component in Mask R-CNN, which proceeds as follows:
1. Input Feature Map: The process begins with the input feature map, which is typically
obtained from the backbone network. This feature map contains high-level semantic information.
2. Region Proposals: The Region Proposal Network (RPN) generates region proposals
(candidate bounding boxes) that might contain objects of interest within the image.
3. Dividing into Grids: Each region proposal is divided into a fixed number of equal-sized
spatial bins or grids. These grids are used to extract features from the input feature map.
4. Bilinear Interpolation: Unlike ROI pooling, which quantizes the spatial coordinates of
the grids to the nearest integer, ROIAlign uses bilinear interpolation to calculate the pooling
contributions for each grid. This interpolation ensures a more precise alignment of the extracted features.
5. Output Features: The features obtained from the input feature map, aligned with each
grid in the output feature map, are used as the representative features for each region
proposal. These aligned features capture fine-grained spatial information, which is crucial for
accurate segmentation.
In summary, ROIAlign improves the accuracy of feature extraction for each region proposal, mitigating
misalignment issues. This precise alignment enables Mask R-CNN to generate more accurate
segmentation masks, especially for small objects or regions that require fine details to be preserved.
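A small runnable example of the operation just described, using torchvision's roi_align (an assumed implementation choice): it extracts a fixed 14x14 feature patch for one proposal given in original-image coordinates.

import torch
from torchvision.ops import roi_align

# A single feature map: batch 1, 256 channels, 50x50 grid
# (e.g., a stride-16 level for an 800x800 input image).
features = torch.rand(1, 256, 50, 50)

# One proposal as (batch_index, x1, y1, x2, y2) in image coordinates.
rois = torch.tensor([[0.0, 100.0, 120.0, 300.0, 360.0]])

# spatial_scale (1/16) maps image coordinates onto this feature map;
# bilinear sampling avoids the rounding that plain RoI pooling performs.
pooled = roi_align(features, rois, output_size=(14, 14),
                   spatial_scale=1.0 / 16, sampling_ratio=2)
print(pooled.shape)  # torch.Size([1, 256, 14, 14])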
Mask Head
The Mask Head is an additional branch in Mask R-CNN, responsible for generating
segmentation masks for each region proposal. The head uses the aligned features obtained
through ROIAlign to predict a binary mask for each object, delineating the pixel-wise
boundaries of the instances. The Mask Head is typically composed of several convolutional layers followed by an upsampling layer and a final per-class mask prediction layer.
Figure 6: Mask Head Structure
Figure 6 illustrates the Mask Head structure, a component commonly used in object detection and instance segmentation frameworks such as Mask R-CNN.
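The sketch below is a minimal stand-in for such a mask head, assuming the standard 14x14 RoI features and 28x28 output masks used in the original Mask R-CNN paper; it is illustrative, not the study's exact head.

import torch
import torch.nn as nn

class SimpleMaskHead(nn.Module):
    """A few 3x3 convolutions, one 2x upsampling step, then a 1x1
    convolution giving one binary mask logit map per class."""
    def __init__(self, in_channels=256, num_classes=5):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),
        )
        self.upsample = nn.ConvTranspose2d(256, 256, 2, stride=2)
        self.predict = nn.Conv2d(256, num_classes, 1)

    def forward(self, roi_features):
        x = self.convs(roi_features)
        x = torch.relu(self.upsample(x))      # 14x14 -> 28x28
        return self.predict(x)                # (N, num_classes, 28, 28)

head = SimpleMaskHead()
print(head(torch.rand(2, 256, 14, 14)).shape)  # torch.Size([2, 5, 28, 28])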
Related Studies
Several studies have explored diverse approaches for disease detection in eggplants and related crops.
Aravind et al. (2019) achieved high accuracy (96.7%) using transfer learning models like
VGG16 and AlexNet for *Solanum Melongena* classification, but their model struggled
with illumination and color variations. Maggay et al. (2020) proposed a mobile-based
system for eggplant disease recognition but only achieved 80-85% accuracy, limiting its
practical use. Xie used spectral and texture features with KNN and AdaBoost for detecting
early blight in eggplant leaves, reaching an accuracy of 88.46%, although HLS images
underperformed.
Wei et al. employed infrared spectroscopy for identifying seedlings affected by root knot
nematodes with 90% accuracy, but their method was only effective under normal conditions.
Sabrol et al. used GLCM and ANFIS for eggplant disease classification, achieving 98%
accuracy, though constrained by a small dataset. Wu et al. combined visible and near-infrared
reflectance spectroscopy with neural networks for early disease detection in eggplants.
Li et al.'s work stands out for utilizing Mask R-CNN for precise segmentation, wavelet
transform for feature enhancement, and F-RNet for classification, achieving 98.7% detection
accuracy for disease and insect spots in tea leaves, outperforming other models. These
studies highlight the potential of advanced image processing and machine learning methods
for plant disease detection, while also underscoring challenges related to environmental variability.
Eggplant leaf diseases in the Philippines have been highlighted by Herbert Dustin Aumentado and Mark Angelo Balendres. Identifying disease-
causing agents and emerging pathogens is crucial for effective disease management. In
November 2021, a new fungal species, *Diaporthe melongenae* sp. nov., was discovered in
Cavite, Philippines, following an outbreak of leaf blights with a 10%-30% disease incidence,
leading to severe economic losses for farmers. Environmental factors and host characteristics
play a role in the severity of the disease. Besides fungal infections, other major challenges
include fruit and shoot borer infestations, bacterial wilt, irrigation issues, and climate-related
problems. The overuse of pesticides in eggplant farming also raises health and environmental
concerns. To address these issues, Local Government Units should prioritize extension programs and farmer education.
The study by Bhattacharya et al. explores the use of the Mask R-CNN model for
identifying eggplant flowering stages, with techniques that can also be applied to classifying
eggplant leaf diseases. To optimize the model, the researchers combined regular and dilated
convolutions and modified the ResNet-50 backbone, improving the model's ability to capture
fine-grained details over larger regions, such as eggplant flowers. These modifications likely
enhanced feature extraction by expanding the receptive field and learning global feature
relationships. Transfer learning with a pre-trained ResNet-50 model was also used to prevent
overfitting, improving training efficiency and generalization to new data. The optimized
model achieved high accuracy, with a mean Average Precision (mAP) of 0.962 and a mean
Intersection over Union (mIOU) of 0.715, indicating strong object detection and
segmentation performance. This research presents a more effective method for identifying eggplant flowering stages, with techniques that may transfer to leaf disease classification.
Recognizing apple leaf diseases using a novel parallel real-time processing framework
based on MASK RCNN and transfer learning: An application for smart agriculture
In the study by Rehman et al., a novel parallel real-time processing framework based on
Mask R-CNN and transfer learning is proposed for recognizing apple leaf diseases, similar to
its application in other plant disease recognition tasks. Techniques like hybrid contrast
stretching for data preprocessing and the use of a pre-trained CNN model for feature
extraction could be adapted to improve input data quality and feature extraction in eggplant
leaf disease detection. The study highlights the importance of automated systems for accurate
and timely disease recognition. By employing a parallel framework, the system enhances image contrast before using Mask R-
CNN to detect diseased regions. Feature extraction is improved through Kapur’s entropy and
the MSVM method. Tested on the Plant Village dataset, the system achieved an accuracy of
96.6% using the ESDA classifier, outperforming previous methods in apple leaf disease
classification. This approach could be beneficial for real-time processing in smart agriculture applications.
Mask R-CNN Refitting Strategy for Plant Counting and Sizing in UAV Imagery
In Machefer et al.’s study, Mask R-CNN is applied to plant counting and sizing in UAV
imagery, a different task but still relevant to plant detection and segmentation. The study
emphasizes the benefits of transfer learning, which could reduce the need for labeled data in
eggplant disease detection by using pre-trained Mask R-CNN models. Machefer et al.
combined remote sensing and deep learning to accurately detect and segment individual plants,
such as potato and lettuce, in aerial images. Fine-tuning Mask R-CNN allowed them to
optimize parameters for the task, and the model outperformed traditional computer vision
methods. The performance was evaluated using mean Average Precision (mAP) for
segmentation and Multiple Object Tracking Accuracy (MOTA) for detection, achieving a mAP
of 0.418 for potato plants and 0.660 for lettuces, and a MOTA of 0.781 for potato plants and
0.918 for lettuces. These results demonstrate the model's effectiveness in both detection and
segmentation tasks, suggesting potential adaptations for improving plant disease detection in
similar applications.
Symptom Recognition of Disease and Insect Damage Based on Mask R-CNN, Wavelet Transform, and F-RNet
Li et al.'s study introduces a robust framework for recognizing tea leaf disease and insect
damage symptoms using Mask R-CNN, wavelet transforms, and F-RNet, which holds
relevance for eggplant leaf disease classification research. The framework utilizes Mask R-
CNN for object detection and segmentation, wavelet transforms for feature enhancement, and
F-RNet for classification. This approach addresses the limitations of manual identification
methods, which suffer from low accuracy and efficiency, offering a more precise and scalable
solution. In their study, Mask R-CNN successfully segmented 98.7% of disease and insect
spots from tea leaves, ensuring that nearly all affected areas were captured. The two-
dimensional discrete wavelet transform further refined the image features by generating images
with four different frequency representations, which were then input into the Four-Channeled
Residual Network (F-RNet) for classification. The F-RNet model achieved an accuracy of
88%, outperforming other models like SVM, AlexNet, VGG16, and ResNet18, demonstrating
its effectiveness in both disease and pest recognition. This methodology not only provides a
highly accurate solution for identifying tea leaf diseases like brown blight, target spot, and tea
coal, along with pests such as *Apolygus lucorum*, but also has the potential for adaptation to
eggplant leaf disease detection. By integrating these techniques, researchers could improve the
precision and robustness of eggplant disease classification systems, paving the way for more accurate and scalable disease management.
Synthesis
Related research shows that the Mask-RCNN model had a high percentage of
classification accuracy of about 98.7%. Studies also reveal that many methods have been
utilized for classifying and detecting the leaf diseases that are found specifically on
eggplant leaves. However, studies achieving high correctness on leaf diseases rarely target the
diseases found on eggplant leaves specifically in the Philippines. The researchers will therefore
create a dataset, annotate the images with labels, pre-process them by resizing, and split them
to train and validate the Mask R-CNN model. The three types of diseases found on
eggplant leaves in the experimental dataset are Cercospora leaf spot, Verticillium wilt, and Early blight.
Table 1: This table compares several approaches used in plant disease detection and classification, summarizing their methods and reported performance.
Conceptual Framework
Figure 7: Conceptual Framework
Figure 7 presents the conceptual framework, which outlines the steps involved in creating the dataset of
eggplant leaves. The input dataset images of eggplant are pre-processed and annotated, and
the dataset is then divided into two subsets, a training set and a validation set, for Mask R-CNN.
The training set is used to train the model, while the validation set guides the tuning of
hyperparameters (epochs, learning rates). The performance of the trained model is assessed on
the validation set by calculating metrics such as precision, recall (sensitivity), and F1_score to
measure the model's effectiveness. If the model's performance is not satisfactory, the
researchers may need to revisit previous steps, for example adding more diverse data or adjusting hyperparameters.
CHAPTER III
METHODOLOGY
Research Method
In this study, the researchers' main objective was to create a dataset of eggplant leaf
diseases and apply a fine-tuned Mask R-CNN model for classifying healthy and
diseased leaves. The researchers will assemble a diverse collection of 4,000 eggplant leaf
images displaying common leaf diseases such as Cercospora leaf spot, Verticillium wilt,
and Early blight, as well as healthy leaves. Each image will thoroughly undergo pre-processing.
In this study, several processes are involved: image collection; image pre-processing, which
includes removing background and noise; and classification using the Mask R-CNN model.
During the collection of the image dataset, samples of eggplant leaves
of each class were photographed and separated. Second, image pre-processing will be
used to remove background and noise, resize images, and improve the data on which the Mask R-CNN
model is trained. Lastly, the Mask R-CNN will be evaluated on the validation set.
Additionally, the study utilized Google Colab as the platform for model training and
validation, employing configurations of learning rate and epochs to assess the model's performance.
Table 2: Sample Images of Eggplant Leaf Diseases
Table 2 provides visual examples of eggplant leaves affected by different diseases. The
images showcase the distinct characteristics associated with each disease, such as color changes, spots, and lesions.
The researchers will gather a minimum of 1,000 images for each category of eggplant
leaf: Cercospora Leaf Spot, Verticillium Wilt, Early Blight, and Healthy. Sample images
were acquired manually indoors by taking photos using a Vivo Y100 mobile phone
equipped with a 12-megapixel camera, against a white background. The images were taken at varying
rotations with natural light and no filter. A total of 1,000 images of
each leaf disease and of healthy leaves were acquired and saved to folders in JPG format.
Before training the Mask R-CNN model, image pre-processing and annotation will be applied, and the collected
images will be split into training and validation sets. During the training of the Mask R-
CNN, the learning rate and epochs will be adjusted to analyze the performance results.
Figure 8 includes three images that represent different stages of capturing
eggplant leaf disease samples, showcasing varying angles and lighting conditions.
Table 3: Camera Settings
Variable        Settings
Resolution      4080 x 2296 pixels
Flash           No flash
Format          JPEG
Table 3 outlines the specific camera settings used during the image acquisition
process. The images were captured at a resolution of 4080 x 2296 pixels, with the flash
turned off. The images were saved in JPEG format, a widely used image file type that balances image quality with file size.
Data Pre-processing
During training of the Mask R-CNN model, the researchers need to get the images
ready for analysis. The researchers will resize them to a standard size, around
244x244 pixels, so they are all the same and easier for the model to handle. Another
important part is data augmentation, where the researchers make slight changes to the
images, like rotating or flipping them, to give the model more examples to learn from.
The output of this stage is a set of pre-processed images resized to 244x244 pixels.
Finally, the researchers will need to make sure their annotations, which mark out the
diseased areas on the leaves, are updated to match the resized images.
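A minimal sketch of this resize-and-augment step is shown below, assuming a hypothetical folder layout (dataset/raw, dataset/preprocessed) and using Pillow; the thesis's actual pre-processing code may differ.

from pathlib import Path
from PIL import Image

TARGET_SIZE = (244, 244)  # the standard size stated above

def preprocess(path, out_dir):
    """Resize one image and save flipped/rotated variants as simple
    data augmentation; annotations must be transformed identically."""
    out_dir.mkdir(parents=True, exist_ok=True)
    img = Image.open(path).convert("RGB").resize(TARGET_SIZE)
    img.save(out_dir / path.name)
    img.transpose(Image.Transpose.FLIP_LEFT_RIGHT).save(out_dir / f"flip_{path.name}")
    img.rotate(90).save(out_dir / f"rot90_{path.name}")

for p in Path("dataset/raw").glob("*.jpg"):   # hypothetical source folder
    preprocess(p, Path("dataset/preprocessed"))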
Figure 9 illustrates the data pre-processing pipeline. The process begins with an input
image, on which annotation is performed to label or mark relevant objects or regions,
including the disease classes Cercospora leaf spot, Early blight, and Verticillium wilt.
Finally, the pre-processed output, which is a modified version of the input image, is
generated.
Table 4 compares original images of eggplant leaves with their corresponding pre-
processed versions.
Data Splitting
In this study, the researchers divided the collected images into training and validation sets.
The training set is for the model to extract features and learn weights. The researchers took 70%
of the data for training. This helps prevent overfitting and ensures that the model generalizes well
to unseen data. This portion is used to train the model on examples of eggplant leaf diseases,
allowing it to learn to recognize patterns and features associated with different diseases. There
are a total of 2,800 sample images for training.
The validation set is for evaluating the model under training and is the initiator of changing the
model parameters to enhance performance. The researchers use 30% of the data for validation.
The model's performance metrics, such as F1_score, precision, and recall, are computed based on
its predictions on the validation set. There are a total of 1,200 sample images for validation.
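The 70/30 split described above can be reproduced with a few lines of Python; the folder name and random seed here are assumptions for illustration.

import random
from pathlib import Path

random.seed(42)  # fixed seed for a reproducible split

images = sorted(Path("dataset/preprocessed").rglob("*.jpg"))  # hypothetical layout
random.shuffle(images)

cut = int(0.7 * len(images))                 # 70% train / 30% validation
train_set, val_set = images[:cut], images[cut:]
print(len(train_set), "training images;", len(val_set), "validation images")
# With 4,000 images this gives 2,800 for training and 1,200 for validation.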
Mask R-CNN
The researchers used Mask R-CNN to classify and segment healthy and diseased areas on
eggplant leaves. It extends a model called Faster R-CNN by adding a branch that creates masks for each detected object.
The model starts with a backbone network like ResNet, which extracts important features
from the images. These features are then used by a Region Proposal Network (RPN) to suggest
possible regions where objects (diseased or healthy leaf areas) might be located. These regions
are aligned accurately with the original image using a process called RoI Align.
For each region, the model predicts a bounding box (the area around the object) and
classifies it as either healthy or one of the disease types (Cercospora Leaf Spot, Verticillium
Wilt, or Early Blight), along with a pixel-level mask of the affected area.
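As an illustration of this per-region output, the hedged sketch below runs inference with the build_model helper and CLASSES list assumed in the Chapter II sketch; the image path and score threshold are hypothetical.

import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor

model = build_model()   # hypothetical helper from the earlier sketch
model.eval()

img = to_tensor(Image.open("leaf.jpg").convert("RGB"))  # hypothetical image
with torch.no_grad():
    output = model([img])[0]   # dict with boxes, labels, scores, masks

keep = output["scores"] > 0.5  # keep confident detections only
for label, box in zip(output["labels"][keep], output["boxes"][keep]):
    print(CLASSES[label.item()], [round(v, 1) for v in box.tolist()])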
Research Instrument
The confusion matrix is a tool used to evaluate the performance of the model, particularly in a classification
problem. It shows the ways in which the classification model is confused when it
makes predictions, summarizing four types of results in the context of eggplant leaf disease
classification, and helps to assess how accurately the model distinguishes the classes.
The confusion matrix is a square matrix where the columns represent the actual
values and the rows the predicted values of the model, or vice versa, depending on convention.
TP (True Positive): The actual value was positive, and the model predicted a positive value.
FP (False Positive): The actual value was negative, but the model predicted a positive value (also known as a Type 1 error).
FN (False Negative): The actual value was positive, but the model predicted a negative value (also known as a Type 2 error).
TN (True Negative): The actual value was negative, and the model predicted a negative value.
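For illustration, scikit-learn can build this matrix directly from predicted and actual labels; the toy labels below are assumptions, not the study's validation output.

from sklearn.metrics import confusion_matrix

LABELS = ["healthy", "cercospora", "verticillium", "early_blight"]

y_true = ["healthy", "early_blight", "verticillium", "cercospora", "healthy"]
y_pred = ["healthy", "early_blight", "verticillium", "early_blight", "healthy"]

cm = confusion_matrix(y_true, y_pred, labels=LABELS)
print(cm)  # rows = actual class, columns = predicted class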
The study used the following performance metrics: precision, sensitivity, and F1_score.
Based on the article of Karimi (2021), the following formulas are used to calculate the metrics:
Precision
Precision measures how many of the positive predictions made by the model were actually correct.
In other words, it assesses the model's ability to avoid false positives. A high precision
indicates that the model is good at identifying only relevant instances (Géron, 2019). The formula for calculating precision is:
Precision = TP / (TP + FP)
Recall (sensitivity)
Recall measures how many of the actual positive instances the model was able to correctly
identify. It assesses the model's ability to avoid false negatives. A high recall indicates that the
model is good at capturing most of the positive instances in the dataset (Géron, 2019). The formula for calculating recall is:
Recall = TP / (TP + FN)
F1 Score
The F1 score is the harmonic mean of precision and recall, providing a single metric that balances the
importance of both. It is particularly useful when there is an imbalance between the positive
and negative classes in the dataset. A high F1-score indicates that the model is performing
well in terms of both precision and recall (Géron, 2019). The formula for calculating the F1 score is:
F1_score = 2 × (Precision × Recall) / (Precision + Recall)
Data analysis then proceeds with statistical analysis, modeling, and interpretation. By following these steps and employing
various techniques, analysts can uncover valuable insights and information from the data.
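These three formulas can be checked numerically with scikit-learn, as in the toy example below (the labels are invented for illustration; 1 marks a diseased leaf).

from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# TP=3, FP=0, FN=1 -> precision 1.00, recall 0.75, F1 = 2*(1.00*0.75)/1.75 = 0.86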
High Recall: Recall values ranging from 0.91 to 1.0 indicate a high level of recall, signifying
the model's ability to accurately identify the majority of true positive instances.
Moderate Recall: Recall values falling between 0.71 and 0.9 suggest moderate recall, showing reasonable effectiveness in identifying positive instances.
Moderately Low Recall: Recall values between 0.51 and 0.7 indicate moderately low recall,
suggesting that the model has only some effectiveness in identifying positive instances.
Low Recall: Recall values below 0.5 represent low recall, indicating that the model misses many positive instances.
High Precision: A value ranging from 0.91 to 1.0 signifies high precision, indicating that the
model's positive predictions are almost always correct.
Moderate Precision: A value falling between 0.71 and 0.9 suggests moderate
precision, showing that the model is reasonably accurate in predicting positive instances.
Moderately Low Precision: A value between 0.51 and 0.7 indicates moderately low
precision, suggesting that the model is only somewhat effective in identifying positive instances.
Low Precision: A value of 0.5 or below represents low precision, indicating that a
large share of the model's positive predictions are incorrect.
High F1_score: A value between 0.91 and 1.0 indicates that the model's performance is
excellent, with a strong balance between precision and recall.
Moderate F1_score: A value between 0.71 and 0.9 indicates that the model's performance
is solid, with a good balance between precision and recall.
Moderately Low F1_score: A value between 0.51 and 0.7 suggests that the model's
performance is moderate, with an imperfect balance between precision and recall.
Low F1_score: A value below 0.5 indicates that the model's performance is poor, showing
some ability to balance precision and recall but still not reliable (scikit-learn metrics documentation). Table 8 summarizes these interpretations.
Table 8: F1_score Rate Interpretation
Data Annotation
The researchers will annotate the image of an eggplant leaf with diseases and a healthy
eggplant leaf. For each image, a ground truth labeled image was manually generated
containing the individual segmentation of diseases found in an eggplant leaf in the image. The
image annotation tool used to produce the ground truth masks is the VGG Image Annotator (VIA).
VIA was developed by the Visual Geometry Group at the University of Oxford and supports a
wide variety of annotations. The specific operation was to define all diseases within an
eggplant leaf by marking the area of the diseased leaf using a bounding box and then labeling the
leaf according to its class (Healthy, Early blight, Cercospora leaf spot, or Verticillium wilt).
The annotations obtained were saved in JSON format with their corresponding images to be used for training.
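A hedged sketch of turning a VIA export into binary masks is shown below; it assumes the common VIA 2.x JSON layout with a region attribute named "label", which may differ from the project's actual export.

import json
import numpy as np
from PIL import Image, ImageDraw

def via_regions_to_masks(via_json_path, image_size=(244, 244)):
    """Convert VIA rectangle/polygon regions into per-label binary masks."""
    with open(via_json_path) as f:
        project = json.load(f)

    masks = {}
    for entry in project.values():             # one entry per annotated image
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            label = region["region_attributes"].get("label", "unknown")
            if shape.get("name") == "rect":
                x, y, w, h = shape["x"], shape["y"], shape["width"], shape["height"]
                points = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
            elif shape.get("name") == "polygon":
                points = list(zip(shape["all_points_x"], shape["all_points_y"]))
            else:
                continue
            mask = Image.new("L", image_size, 0)
            ImageDraw.Draw(mask).polygon(points, outline=1, fill=1)
            masks.setdefault(entry["filename"], []).append(
                (label, np.array(mask, dtype=bool)))
    return masks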
Ethical Considerations
The researchers used their own image dataset, so no copyright infringement took place.
Hyperparameter Optimization
Hyperparameters are important because they can have a big impact on model training as it
relates to training time, infrastructure resource requirements, model convergence and model
accuracy (Kasture, 2020). Selection of the right machine learning model and the
corresponding correct set of hyperparameter values are very crucial since they are the key for
the model’s overall performance (Kumar, 2021). Optimization of the hyperparameters will be
done through manual search wherein combinations of hyperparameter values will be tested,
and train the model for each combination, then pick the one that gives the best result. Below
Learning rate: the learning rate is a hyperparameter that controls how much to change the
model in response to the estimated error each time the model weights are updated. Choosing a
too small value may result in a long training process that could get stuck, whereas choosing a
too large value may result in learning a sub-optimal set of weights too fast or an unstable training process.
Epoch: the epoch indicates the number of passes of the entire training dataset the machine
learning algorithm has completed (Bell, 2020). Too many epochs can lead to overfitting of the
training dataset, whereas too few may result in an underfit model (Brownlee, 2018).
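The manual search over the grid used in this study (learning rates 0.01/0.001/0.0001, epochs 10/20/30, per Chapter IV) can be organized as a simple loop; train_and_validate below is a stub standing in for a real Mask R-CNN training run.

from itertools import product

LEARNING_RATES = [0.01, 0.001, 0.0001]
EPOCHS = [10, 20, 30]

def train_and_validate(lr, n_epochs):
    """Stub: replace with a real training run returning a validation F1_score.
    A toy score is returned here only so the loop is runnable."""
    return (1.0 / (1.0 + abs(lr - 0.01))) * min(n_epochs, 20) / 20.0

best = None
for lr, n_epochs in product(LEARNING_RATES, EPOCHS):
    f1 = train_and_validate(lr, n_epochs)
    if best is None or f1 > best[0]:
        best = (f1, lr, n_epochs)

print(f"best F1={best[0]:.2f} at lr={best[1]}, epochs={best[2]}")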
CHAPTER IV
RESULTS AND DISCUSSION
The researchers collected a dataset of eggplant leaves by capturing images specifically
focused on these leaf diseases; removing the background from the images, a common image
processing task, was applied during pre-processing. By analyzing the color distribution within
the eggplant leaf images, the leaves were categorized into four classes: healthy, Early blight,
Cercospora leaf spot, and Verticillium wilt. These pre-processing steps are often necessary for
subsequent analysis or machine learning tasks involving image datasets. The said architecture
was configured on hyperparameters such as learning rate and epoch. Several values of
learning rate and epoch were tested to select the best for classifying eggplant leaf disease
based on precision, sensitivity, and F1_score. The Mask R-CNN architecture was utilized to
classify and detect the four eggplant leaf classes.
To prepare a dataset for training a model with Mask R-CNN, organize labeled images of
eggplant leaf diseases and healthy leaves into separate folders. Set up Google Colab, upload
the dataset using unzip, and install libraries and frameworks like TensorFlow via pip install.
Utilize appropriate data loading and preprocessing techniques for Mask R-CNN. A sample environment setup is sketched below.
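The following minimal sketch assumes a hypothetical archive name; it mirrors the unzip-and-pip workflow described above rather than the authors' actual notebook.

# Run inside a Google Colab notebook cell.
import zipfile

with zipfile.ZipFile("eggplant_leaf_dataset.zip") as zf:  # hypothetical upload
    zf.extractall("dataset")

# Dependencies are installed with pip in a Colab cell, for example:
#   !pip install tensorflow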
Table 9: Training results of Mask R-CNN architecture based on precision, recall, and F1-
Score of healthy leaf.
The Mask R-CNN architecture was successfully configured. For the learning rate of 0.01,
the model shows strong results across different epochs, with a consistent recall of 1.00,
indicating that it captures all positive cases. Precision ranges from 0.65 to 0.75, which
suggests that while the model predicts most positives correctly, there are some false positives.
As epochs increase, the F1-score improves, peaking at 0.86 for 20 epochs. This learning rate
provides a good balance between learning speed and accuracy, with 20 epochs being the most effective setting.
With a learning rate of 0.001, the model requires more epochs to show improvement but still
struggles with lower precision and recall compared to 0.01. Precision starts low and improves
slightly with more epochs, reaching only 0.33 at 30 epochs, while recall hits 1.00 at that point.
This suggests that the model is highly sensitive to detecting positives but also misclassifies many
negatives as positives. The lower precision and variable performance indicate that this learning rate may be too low for effective training in this setup.
At a learning rate of 0.0001, the model performs poorly overall, with low precision and F1-
scores even as epochs increase. While recall reaches 1.00 by 30 epochs, precision remains
very low, ranging from 0.16 to 0.25, suggesting significant underfitting with many
false positives. This learning rate is too low to effectively capture patterns in the data within
the 10 to 30 epoch range, making it unsuitable for the current training setup.
Table 10: Training results of Mask R-CNN architecture based on precision, recall, and
F1-score of Cercospora leaf spot.
With the Mask R-CNN architecture at a learning rate of 0.01, the model shows a moderate
and consistent performance across 10, 20, and 30 epochs, with precision, recall, and F1-
scores hovering around 0.45 to 0.5. This steadiness suggests that the model may not be
effectively learning with this rate, as increasing the epochs does not yield noticeable
improvements, indicating possible underfitting where the model fails to capture sufficient
complexity in the data. By contrast, at a learning rate of 0.001, the model performs poorly at
10 and 20 epochs with low precision and recall, resulting in low F1-scores around 0.35.
However, at 30 epochs, recall jumps dramatically to 1.0, but precision remains low at 0.33,
yielding an F1-score of 0.49. This suggests that the model over-predicts the positive class, capturing
nearly all positives while misclassifying many negatives, making this rate too low to achieve a
balanced performance. Finally, at a learning rate of 0.0001, the model shows steady
improvement across epochs, reaching the best balance at 30 epochs with a precision of 0.73,
recall of 0.57, and F1-score of 0.57. This gradual improvement across metrics indicates that
the slower learning rate enables the model to capture data patterns more effectively over
time, allowing for better generalization without overfitting, making it the most effective
Table 11: Training results of Mask R-CNN architecture based on precision, recall, and
F1-score of Early blight class.
Table 11 presents the model's precision, recall, and F1-score when identifying Early blight. With a learning rate of 0.01, the
model shows high precision across epochs but varying recall and F1-score results. At 10
epochs, precision is 0.92, recall is 0.55, and F1-score is 0.69, indicating that the model
correctly identifies many true positives but has moderate recall, resulting in an intermediate
F1-score. Increasing to 20 epochs, the precision rises slightly to 0.93, with a notable
improvement in recall to 0.70, yielding a higher F1-score of 0.80. However, at 30 epochs, the
precision reaches a perfect 1.00, but recall drops sharply to 0.32, lowering the F1-score to
0.48. This suggests overfitting, where the model confidently predicts fewer instances, thereby
lowering recall.
With a learning rate of 0.001, the model’s performance is generally less consistent and
slightly lower across metrics. At 10 epochs, precision, recall, and F1-score are lower at 0.39,
0.35, and 0.37, respectively, suggesting limited learning. After 20 epochs, precision improves
to 0.90, but recall is still relatively low at 0.50, producing an F1-score of 0.60. By 30 epochs,
precision decreases to 0.44, and recall falls to 0.21, resulting in an F1-score of 0.29, indicating
that the model struggles to balance between true positive predictions and overall sensitivity.
With a smaller learning rate of 0.0001, the model initially exhibits moderate performance but
fluctuates across epochs. At 10 epochs, precision is
0.69, recall is 0.45, and the F1-score is 0.57, indicating a relatively balanced but moderate
result. At 20 epochs, performance drops, with precision falling to 0.50 and recall plummeting
to 0.05, resulting in a very low F1-score of 0.09, which suggests ineffective learning. By 30
epochs, the model shows high precision at 0.94 with recovered recall, producing a
high F1-score of 0.89, which may indicate improved prediction certainty, though performance at this learning rate remains unstable across epochs.
Table 12: Training results of Mask R-CNN architecture based on precision, recall, and
F1-Score of Verticillium wilt.
The Mask R-CNN architecture was successfully configured, and the values that give the best
result for training on the Verticillium wilt class are 20 and 0.01 for epoch and
learning rate, respectively. With a learning rate of 0.01, the model shows strong performance
across all epoch levels. At 10 epochs, it achieves a precision of 0.83 and a recall of 1.00,
resulting in an F1-score of 0.91. This indicates that the model is able to detect all Verticillium wilt
instances, with some minor inaccuracies. When the training reaches 20 epochs, precision
improves to 0.94 while maintaining a perfect recall, leading to an F1-score of 0.97, the best
result for this class. At 30 epochs, precision slightly decreases to 0.88, but the recall remains at 1.00, resulting in an F1-score of 0.94.
This performance indicates that while increasing the epochs can enhance accuracy, the
improvement starts to plateau, suggesting that further training may not yield significant
gains.
With a learning rate of 0.001, the model experiences difficulties in achieving a high balance
between precision and recall. At 10 epochs, the model has a precision of 0.80 but a recall of
only 0.27, yielding an F1-score of 0.40, which suggests that while it avoids false positives, it
misses most Verticillium wilt cases. As the training progresses to 20 epochs, recall improves to
1.00, but this comes at the cost of precision, which drops to 0.37, resulting in an F1-score of
0.54. This change implies the model becomes overly sensitive and misclassifies many cases
as Verticillium wilt. At 30 epochs, the precision and recall become more balanced at 0.38 and
0.30, respectively, with an F1-score of 0.45. Overall, at this learning rate, the model struggles
to achieve strong performance in detecting Verticillium wilt, even with extended training.
At a learning rate of 0.0001, the model requires more epochs to reach higher levels of
accuracy. At 10 epochs, it has a precision of 0.84 and a recall of 0.37, resulting in an F1-
score of 0.53, which shows moderate precision but low sensitivity to detecting Verticillium wilt
cases. Increasing to 20 epochs, the precision slightly decreases to 0.81, with a recall of 0.40
and a similarly modest F1-score. By 30
epochs, the model achieves a significant boost in both precision and recall, reaching 0.95 and
0.90, respectively, and yielding an F1-score of 0.93. This suggests that with enough training,
a low learning rate can eventually yield balanced and high performance, though it requires a longer training time.
CHAPTER V
CONCLUSION AND RECOMMENDATION
The researchers created their own dataset for classifying eggplant leaves as
healthy or diseased with Cercospora leaf spot, Early blight, or Verticillium wilt,
through a process that involves collecting, resizing, and annotating a diverse set of images, splitting the images
into training and validation sets, training a model using machine learning techniques,
specifically a deep learning model, and evaluating its performance across hyperparameter settings.
The Mask R-CNN model demonstrated varying performance across across different leaf
disease classes. For Early Blight, the highest F1-score achieved was 0.89, while Verticillium
wilt, it reached 0.94. However, the model’s performance was sensitive to learning rate and
combinations. Additionally, trade-off between precision and recall was evident in some
cases. This implies that the said Mask R-CNN model can correctly classify the healthy leaf
The results highlight the effective performance of Mask R-CNN in classifying eggplant leaves as healthy or diseased. For Cercospora Leaf Spot, Verticillium Wilt, and Early Blight, the model achieved F1-scores ranging from 0.37 to 0.94, with precision and recall sensitive to hyperparameters such as learning rate and number of epochs. While the model showed promising results, particularly for Verticillium wilt, further optimization is needed to improve performance for Early Blight and Cercospora Leaf Spot and to strengthen the model's overall robustness.
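The sensitivity to learning rate and epochs noted above corresponds to a small grid search over the two hyperparameters; a hypothetical sketch follows, where train_and_eval is a placeholder for the actual Mask R-CNN training and validation routine rather than a function from this study.

def train_and_eval(lr, epochs):
    # Placeholder: train Mask R-CNN with (lr, epochs) and return validation F1
    raise NotImplementedError("wire this to the real training pipeline")

best = None
for lr in (0.01, 0.001, 0.0001):
    for epochs in (10, 20, 30):
        f1 = train_and_eval(lr=lr, epochs=epochs)
        if best is None or f1 > best[0]:
            best = (f1, lr, epochs)
print("best F1 %.2f at lr=%s, epochs=%d" % best)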
Recommendations
Although this research has shown the potential of Mask R-CNN for detecting eggplant leaf diseases, there are still opportunities for future studies to improve the model. Expanding the dataset with a wider variety of images of eggplant leaf diseases would enhance the model's ability to generalize and its accuracy. By including pictures of uncommon or newly emerging diseases, the algorithm can learn to identify a broader range of conditions.
Additionally, creating a mobile application that is easy for users to navigate could greatly transform disease detection within agricultural environments. Such an app could use smartphone cameras to capture pictures of damaged leaves, enabling farmers to get an instant diagnosis and treatment advice on their devices. This would enable earlier intervention and help reduce crop losses.
Moreover, exploring additional cutting-edge deep learning approaches like YOLOv8 or EfficientDet may enhance the model's accuracy, speed, and efficiency. These architectures are recognized for their ability to detect objects in real time, which makes them well suited to mobile apps. Furthermore, implementing hardware optimization methods, such as running inference on specialized hardware like TPUs or GPUs, could greatly speed up the inference process, leading to quicker and more effective disease detection.
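As a concrete example of that hardware recommendation, ONNX Runtime (already used for inference in Appendix D) can be pointed at a GPU simply by changing the execution providers; this sketch assumes the onnxruntime-gpu package is installed and reuses the same checkpoint.onnx file.

import onnxruntime as ort

# Prefer CUDA if available, fall back to CPU otherwise
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
session = ort.InferenceSession("./checkpoint.onnx", providers=providers)
print(session.get_providers())  # shows which providers were actually loaded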
In summary, this study provides a strong basis for future progress in identifying diseases in eggplant leaves. By acknowledging the constraints of the present study and exploring the suggested research directions, stronger and more practical solutions can be developed to assist farmers in safeguarding their crops and guaranteeing food security.
References
Ansari, A. M., Hasan, W., & Prakash, M. (Eds.). (2021). Solanum melongena: Production, cultivation and nutrition. Nova Science Publishers.
Aumentado, H. D., & Balendres, M. A. (2023). Diaporthe melongenae sp. nov., a new fungal species causing leaf blight in eggplant. Journal of Phytopathology. Advance online publication. https://doi.org/10.1111/jph.13246
Bhattacharya, S., Banerjee, A., Ray, S., Mandal, S., & Chakraborty, D. (2022,
September). An Advanced Approach to Detect Plant Diseases by the Use of
CNN Based Image Processing. In International Conference on Innovations in
Computer Science and Engineering (pp. 467-478). Singapore: Springer Nature
Singapore.
Chakravarty, A., Jain, A., & Saxena, A. K. (2022, December). Disease Detection of
Plants using Deep Learning Approach—A Review. In 2022 11th International
Conference on System Modeling & Advancement in Research Trends (SMART)
(pp. 1285-1292). IEEE.
Cheng, L., Li, J., Duan, P., & Wang, M. (2021). A small attentional YOLO model for
landslide detection from satellite remote sensing images. Landslides, 18(8),
2751-2765.
Chowdhury, M. E., Rahman, T., Khandakar, A., Ayari, M. A., Khan, A. U., Khan, M.
S., ... & Ali, S. H. M. (2021). Automatic and reliable leaf disease detection using
deep learning techniques. AgriEngineering, 3(2), 294-312.
Dimkić, I., Janakiev, T., Petrović, M., Degrassi, G., & Fira, D. (2022). Plant-
associated Bacillus and Pseudomonas antimicrobial activities in plant disease
suppression via biological control mechanisms-A review. Physiological and
Molecular Plant Pathology, 117, 101754.
Jayanthi, M. G., et al. (2022). Eggplant leaf disease detection and segmentation using adaptively regularized multi-kernel-based Fuzzy C-Means and optimal PNN classifier. Indian Journal of Computer Science and Engineering (IJCSE), 13(5).
Haque, M. R., & Sohel, F. (2022). Deep network with score level fusion and
inference-based transfer learning to recognize leaf blight and fruit rot diseases of
eggplant. Agriculture, 12(8), 1160.
Jindo, K., Evenhuis, A., Kempenaar, C., Sudré, C. P., Zhan, X., Teklu, M.
G., & Kessel, G. (2021, February 27). Review: Holistic pest management
against early blight disease towards sustainable agriculture. Pest Management
Science (Print). https://doi.org/10.1002/ps.6320.
Jin, X., Che, J., & Chen, Y. (2021). Weed identification using deep learning and image processing in vegetable plantation. IEEE Access, 9, 10940-10950. https://doi.org/10.1109/ACCESS.2021.305029
Kaniyassery, A., Goyal, A., Thorat, S. A., Rao, M. R., Chandrashekar, H. K., Murali, T. S., & Muthusamy, A. Association of meteorological variables with leaf spot and fruit rot disease incidence in eggplant and AI-based disease classification. Available at SSRN 4555881.
Kaniyassery, A., Thorat, S. A., Kiran, K. R., Murali, T. S., & Muthusamy, A. (2023).
Fungal diseases of eggplant (Solanum melongena L.) and components of the
disease triangle: a review. Journal of Crop Improvement, 37(4), 543-594.
Khan, A. I., & Al-Habsi, S. (2020, January 1). Machine Learning in Computer Vision.
Procedia Computer Science. https://doi.org/10.1016/j.procs.2020.03.355
Kashyap, G. S., Kamani, P., Kanojia, M., Wazir, S., Malik, K., Sehgal, V. K., &
Dhakar, R. (2024). Revolutionizing Agriculture: A Comprehensive Review of
Artificial Intelligence Techniques in Farming.
Li, H., Shi, H., Du, A., Mao, Y., Fan, K., Wang, Y., ... & Ding, Z. (2022). Symptom
recognition of disease and insect damage based on Mask R-CNN, wavelet
transform, and F-RNet. Frontiers in Plant Science, 13, 922797.
Lippi, M., Bonucci, N., Carpio, R. F., Contarini, M., Speranza, S., & Gasparri, A.
(2021, June). A yolo-based pest detection system for precision agriculture. In
2021 29th Mediterranean Conference on Control and Automation (MED) (pp.
342-347). IEEE.
Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., ... & Yang, J. (2020). Generalized
focal loss: Learning qualified and distributed bounding boxes for dense object
detection. Advances in Neural Information Processing Systems, 33, 21002-
21012.
Ma, W., Wang, X., Qi, L., & Zhang, D. (2019). Identification of Eggplant Young
Seedlings Infected by Root Knot Nematodes Using Near Infrared Spectroscopy.
In Computer and Computing Technologies in Agriculture X: 10th IFIP WG 5.14
International Conference, CCTA 2016, Dongying, China, October 19–21, 2016,
Proceedings 10 (pp. 93-100). Springer International Publishing.
Matsuzaka, Y., & Yashiro, R. (2023). AI-based computer vision techniques and expert
systems. AI, 4(1), 289-302.
Melicano III, M. (2024, March 7). Eggplant Leaf Diseases. Personal interview.
Nasution, S. W., & Kartika, K. (2022). Eggplant Disease Detection Using Yolo
Algorithm Telegram Notified. International Journal of Engineering, Science
and Information Technology, 2(4), 127-132.
Ngugi, L. C., Abelwahab, M., & Abo-Zahhad, M. (2021). Recent advances in image
processing techniques for automated leaf pest and disease recognition–A review.
Information processing in agriculture, 8(1), 27-51.
Smith, M. (2019). Mastering APIs: Design and Build High-Quality Web APIs.
O'Reilly Media.
Terven, J., Córdova-Esparza, D. M., & Romero-González, J. A. (2023). A
comprehensive review of yolo architectures in computer vision: From yolov1 to
yolov8 and yolo-nas. Machine Learning and Knowledge Extraction, 5(4), 1680-
1716.
Viswanath, K. K., Varakumar, P., Pamuru, R. R., Basha, S. J., Mehta, S., & Rao, A. D.
(2020). Plant lipoxygenases and their role in plant physiology. Journal of Plant
Biology, 63, 83-95.
Wu, D., Lv, S., Jiang, M., & Song, H. (2020). Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments. Computers and Electronics in Agriculture, 178, 105742.
Wu, D., Feng, L., Zhang, C., & He, Y. (2008). Early detection of Botrytis cinerea on
eggplant leaves based on visible and near-infrared spectroscopy. Transactions of
the ASABE, 51(3), 1133-1139.
Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020, April). Distance-IoU
loss: Faster and better learning for bounding box regression. In Proceedings of
the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 12993-
13000).
Machefer, M., Lemarchand, F., Bonnefond, V., Hitchins, A., & Sidiropoulos, P.
(2020). Mask R-CNN refitting strategy for plant counting and sizing in UAV
imagery. Remote Sensing, 12(18), 3015.
Rehman, Z. U., Khan, M. A., Ahmed, F., Damaševičius, R., Naqvi, S. R., Nisar, W., &
Javed, K. (2021). Recognizing apple leaf diseases using a novel parallel real‐
time processing framework based on MASK RCNN and transfer learning: An
application for smart agriculture. IET Image Processing, 15(10), 2157-2168.
APPENDICES
Appendix A.
Appendix B. Sample eggplant leaf images: Healthy, Early Blight, and Verticillium wilt
Appendix C. Validation results of Mask R-CNN
Validation result of Mask R-CNN at 0.001 Learning Rate and 10 Epochs
Validation result of Mask R-CNN at 0.0001 Learning Rate and 10 Epochs
Validation result of Mask R-CNN at 0.01 Learning Rate and 20 Epochs
Validation result of Mask R-CNN at 0.001 Learning Rate and 20 Epochs
Validation result of Mask R-CNN at 0.0001 Learning Rate and 20 Epochs
Validation result of Mask R-CNN at 0.001 Learning Rate and 30 Epochs
Appendix D. Source Code

import os
import cv2
import numpy as np
import tkinter as tk
from tkinter import filedialog
import matplotlib.pyplot as plt
import onnxruntime as ort  # required for InferenceSession; missing in the original listing
import torch
from torchvision.transforms import v2 as T
from torchvision.utils import draw_bounding_boxes, draw_segmentation_masks

onnx_file_path = "./checkpoint.onnx"

# Load the exported Mask R-CNN model and create an InferenceSession (CPU only)
session = ort.InferenceSession(onnx_file_path, providers=["CPUExecutionProvider"])

def get_transform(train):
    # Horizontal flips are only applied during training; evaluation just
    # converts the image to a scaled float tensor.
    transforms = []
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    transforms.append(T.ToDtype(torch.float, scale=True))
    transforms.append(T.ToPureTensor())
    return T.Compose(transforms)

def open_img():
    path = openfilename()
    if not path:
        return
    image_input = cv2.imread(path)
    image_input = cv2.resize(image_input, (512, 512))

    # OpenCV loads BGR; convert to RGB and to a CHW uint8 tensor
    image_rgb = cv2.cvtColor(image_input, cv2.COLOR_BGR2RGB)
    image_tensor = torch.from_numpy(image_rgb).permute(2, 0, 1)

    eval_transform = get_transform(train=False)
    input_tensor_np = eval_transform(image_tensor).numpy()

    # Run inference; assumes the exported graph takes a single CHW image
    # named "input" and returns [boxes, labels, scores, masks]
    model_output = session.run(None, {"input": input_tensor_np})

    # Keep only the highest-scoring detection
    scores = torch.Tensor(model_output[2])
    idx = int(torch.argmax(scores))
    pred_boxes = torch.Tensor(np.array([model_output[0][idx]]))
    labels = torch.Tensor([model_output[1][idx]]).to(torch.uint8)
    masks = torch.Tensor(model_output[3][idx])

    # Overlay the predicted box and mask on the resized input image
    output_image = draw_bounding_boxes(image_tensor, pred_boxes, colors="red", width=2)
    output_image = draw_segmentation_masks(output_image, masks > 0.5, alpha=0.5, colors="blue")

    plt.figure(figsize=(8, 8))
    plt.imshow(output_image.permute(1, 2, 0))
    plt.show()

def openfilename():
    file = filedialog.askopenfilename(title="Open image")
    return os.path.abspath(file) if file else ""

if __name__ == "__main__":
    root = tk.Tk()
    root.title("Image Loader")
    root.geometry("400x200")
    root.resizable(True, True)
    btn = tk.Button(root, text="Open Image", command=open_img)
    btn.grid(row=9, columnspan=9)
    root.mainloop()
Appendix E. Documentation of Eggplant Leaves
Appendix F. Curriculum Vitae
EDUCATION
REFERENCES
CABRILLOS, SANNY M.
Igcabito-on, Miag-ao, Iloilo
09657191713
[email protected]
PERSONAL DATA
AGE: 24
SEX: Male
BIRTHDAY: January 01, 2000
CIVIL STATUS: Single
NATIONALITY: Filipino
RELIGION: Catholic
EDUCATION
REFERENCES
FORTIN, JENNY ROSE S.
Trinidad, San Remigio, Antique
09692443028
[email protected]
PERSONAL DATA
AGE: 22
SEX: Female
BIRTHDAY: February 02, 2002
CIVIL STATUS: Single
NATIONALITY: Filipino
RELIGION: Catholic
EDUCATION
REFERENCES