
Utilizing Mask-RCNN for the Classification of Eggplant Leaf Diseases

An Undergraduate Thesis Presented to


the Faculty of the College of Computer Studies
University of Antique – Main Campus

In partial fulfillment
of the requirements for the degree
Bachelor of Science in Computer Science

Hermie Rose G. Cabrillos


Sanny M. Cabrillos
Jenny Rose S. Fortin

July 2024
APPROVAL SHEET

This thesis entitled “UTILIZING MASK-RCNN FOR THE CLASSIFICATION


OF EGGPLANT LEAF DISEASES” prepared and submitted by HERMIE ROSE G.
CABRILLOS, SANNY M. CABRILLOS, AND JENNY ROSE S. FORTIN in partial
fulfilment of the requirements for the degree in Bachelor of Science in Computer
Science, has been examined and is recommended for acceptance and approval for Oral
Examination.

RYAN ERCEL O. PADERES


Adviser

Approved by the Committee on Oral Examination on July 03, 2024.

ROSARIE G. SANCHEZ FEMA ROSE B. ECRAELA


Member Member

VERA PANAGUITON
Member

ACCEPTED and APPROVED in partial fulfilment of the requirements for the degree
Bachelor of Science in Computer Science.

RICARDO A. CATANGHAL JR., DIT


Dean, College of Computer Studies

ABSTRACT
Eggplant (Solanum melongena) is considered one of the leading and profitable vegetable

crops grown by farmers (Philippine Statistics Authority, 2023). It is also one of the most

popular fruit vegetables in the Philippines and is one of the important crops, but it is

susceptible to serious diseases that hinder its production (Martins et al., 2024). In 2020, the

annual eggplant production in Western Visayas was ranked fifth in the Philippines,

contributing to 7.05% of the total eggplant yield in the country. Spanning across 1,605

hectares of cultivated land, the region achieved a remarkable yield of 10,664 kilograms per

hectare (Philippine Statistics Authority, 2021).

The dataset of eggplant leaves was manually annotated to meet the model’s training

requirements. The model was trained using a dataset of annotated eggplant leaf images,

categorized into healthy, Cercospora leaf spot, Verticillium wilt, and Early blight. The best

performance was achieved for Verticillium wilt, with an F1_score of 0.94. For Early blight

and Cercospora leaf spot, the model’s performance was satisfactory but could be further

improved. The results demonstrate the model’s effectiveness in distinguishing between

healthy and diseased leaves, but further optimization is necessary to enhance accuracy and

robustness, especially for Early blight and Cercospora leaf spot. This research contributes to

the development of advanced disease detection tools for eggplant farmers, enabling them to

make informed decisions for sustainable crop management.

ACKNOWLEDGEMENT

Words cannot express our deep gratitude to our adviser, Mr. Ryan Ercel Paderes, for

his invaluable patience, insightful feedback, and continuous encouragement throughout this

journey. His dedication and commitment to our success have been unwavering, and we are

forever grateful for his guidance.

We are also immensely thankful to our defense committee, whose generosity in sharing their

knowledge and expertise was instrumental in the completion of this thesis. Their

constructive criticism and invaluable suggestions helped us refine our work and push the

boundaries of our research.

A special note of thanks goes to our thesis writing II instructor, Mr. John Jowil Orquia, for

his untiring support and mentorship throughout the thesis writing process. His profound

knowledge, coupled with his patience and understanding, has been a cornerstone of our

success. His guidance not only helped shape this thesis but also encouraged us to strive for

excellence.

We extend our heartfelt thanks to Mr. Ricardo Catanghal, PhD, Dean of the College of

Computer Studies, for his comments and suggestions during the defense, which helped us

understand our work further. We are also grateful to Dr. Sammy V. Militante, Dr. Renecynth

Juarigue, Mrs. Fema Rose B. Ecraela, Mrs. Rosarie G. Sanchez, and Ms. Vera Panaguiton

for their invaluable contributions as members of our thesis committee. Their expert

feedback and insightful suggestions played a crucial role in the development and refinement

of our research, ultimately contributing to the successful completion of this thesis.

We recognize that this accomplishment is not ours alone, but a culmination of the support,

guidance, and wisdom of our advisors, and committee members. Above all, our most

profound thanks go to Almighty God, whose divine guidance and blessings have been our

source of strength and inspiration throughout this journey. Without His grace, this

achievement would not have been possible.

TABLE OF CONTENTS

TITLE PAGE..................................................................................................................

APPROVAL SHEET......................................................................................................ii

ABSTRACT....................................................................................................................iii

ACKNOWLEDGEMENT..............................................................................................iv

LIST OF FIGURES........................................................................................................viii

LIST OF TABLES..........................................................................................................ix

LIST OF FORMULAS...................................................................................................x

CHAPTER I INTRODUCTION

Background of the Study................................................................................................1

Statement of the Problem................................................................................................2

Objectives.......................................................................................................................3

Significance of the Study................................................................................................3

Scope and Limitations of the Study.................................................................4

CHAPTER II REVIEW OF RELATED LITERATURE

Related Studies................................................................................................................19

Synthesis.........................................................................................................................25

Conceptual Framework...................................................................................................28

CHAPTER III METHODOLOGY

Research Method............................................................................................................29

Research Instrument........................................................................................................35

Data Analysis Procedure.................................................................................................37

Ethical Consideration......................................................................................................41

Hyperparameter Optimization........................................................................................41

CHAPTER IV RESULTS AND DISCUSSION

Results and Discussions.............................................................................................42

CHAPTER V CONCLUSION AND RECOMMENDATION

Conclusion....................................................................................................................47

Recommendation..........................................................................................................48

REFERENCES..................................................................................................................49

APPENDICES...................................................................................................................53

A. Eggplant Leaf Resizing ................................................................................................53

B. Eggplant Leaf Dataset .................................................................................................54

C. Validation result of Mask R-CNN at Learning Rate 0.01, 0.001, and 0.0001

in 10, 20, and 30 Epochs.............................................................................................56

D. Source Code..................................................................................................................61

E. Documentary of Eggplant Leaf Rotating......................................................................63

F. Curriculum Vitae............................................................................................................64

LIST OF FIGURES

Figures

1 Mask R-CNN: Overview 12

2 Mask R-CNN Architecture 13

3 Feature Pyramid Network (FPN) backbone 14

4 Region Proposal Network 16

5 ROI Align Operation 17

6 Mask Head Structure 19

7 Conceptual Framework 27

8 Camera Control Setting During Image Acquisition 31

9 Data Pre-Processing 32

LIST OF TABLES

Table

1 Related Studies Comparison 26

2 Sample Images of Eggplant Leaf Diseases 30

3 The camera control setting during image acquisition 31

4 Sample Images of Pre-processed Images 33

5 Confusion Matrix 35

6 Recall (Sensitivity) Interpretation of Results 38

7 Precision Interpretation of Results 39

8 F1-Score Interpretation of Results 40

9 Training results of Mask R-CNN architecture based on precision, recall,

and F1-Score of Healthy leaf 43

10 Training results of Mask R-CNN architecture based on precision, recall,

and F1-score of Cercospora leaf spot 44

11 Training results of Mask R-CNN architecture based on precision, recall,

and F1-score of Early blight class 45

12 Training results of Mask R-CNN architecture based on precision, recall,

and F1-Score of Verticillium wilt 46

LIST OF FORMULAS

Formula

1 Precision Formula 35

2 Sensitivity Formula 37

3 F1_score Formula 37

CHAPTER I

INTRODUCTION

Background of the Study

Eggplant (Solanum melongena) is considered one of the leading and profitable

vegetable crops grown by farmers (Philippine Statistics Authority, 2023). It is also one of

the most popular fruit vegetables in the Philippines and is one of the important crops, but

it is susceptible to serious diseases that hinder its production (Martins et al., 2024). In

2020, the annual eggplant production in Western Visayas was ranked fifth in the

Philippines, contributing to 7.05% of the total eggplant yield in the country. Spanning

across 1,605 hectares of cultivated land, the region achieved a remarkable yield of 10,664

kilograms per hectare (Philippine Statistics Authority, 2021).

In the second quarter of 2022, production increased by 0.2% to 106.87 thousand

metric tons compared to the previous year. From 2011 to 2016, production rose from

208,000 metric tons to 235,600 metric tons (Philippine Statistics Authority, 2023).

In an interview with the Municipal Agriculturist of San Remigio, Mr.

Melicano, conducted on March 7, 2024, valuable insights were gathered regarding identifying and

classifying eggplant diseases. He described eggplant shapes as elongated, cylindrical,

ovoid, slender, or round, with spiny or nonspiny calyxes, and lengths varying from 2″ to

over 12″. Mr. Melicano highlighted key disease indicators, such as leaves showing

disease symptoms versus being reduced by animals like chickens. He identified the

primary causes of diseases as bacteria, viruses, fungi, and nematodes. Regarding worm

infestations, he noted that the impact on eggplant fruit depends on the extent of leaf

damage. For disease prevention, he recommended proper crop care practices and the

removal of pests such as worms and insects. Mr. Melicano's insights underscored the

importance of proactive disease management strategies for maintaining optimal eggplant

crop health.

An automated eggplant disease diagnostic system could provide information for the

prevention and control of eggplant diseases. Numerous diseases that affect eggplant

might reduce its output and quality of produce (Kaniyassery et al., 2022). The

unpredictability of illness incidence has risen due to rapidly changing environmental

conditions (Singh et al., 2023). Farmers have traditionally relied on visual inspection to

identify diseases in their eggplant crops. This method, however, is time-consuming,

labor-intensive, and prone to human error. Additionally, early signs of disease can be

difficult to spot with the naked eye, leading to delayed treatment and reduced crop yields

(Arakeri & Lakshmana, 2016).

Recent advancements in computer vision and deep learning offer a more efficient

solution. Mask R-CNN, a type of image recognition technology, has shown great promise

in various object detection tasks, including plant disease identification. This technology

can analyze digital images of eggplant leaves, pinpointing areas that might be diseased.

In this study, the researchers will create a dataset of common healthy leaf and eggplant

leaf diseases such as Cercospora leaf spot, Verticillium wilt, and Early blight, and train

the datasets using the Mask-RCNN for detection and classification.

Statement of the Problem
Eggplant cultivation faces significant challenges due to various leaf diseases such as

Cercospora leaf spot, Verticillium wilt, and Early blight. Traditional methods of detecting

diseases in eggplant plants, such as visually inspecting and checking the leaves manually, are

labor-intensive and can lead to mistakes because they depend on people's opinions and can be

affected by human errors. This process takes a lot of time, requires trained individuals,

and is not easy to do on a large scale. Sometimes, different people may see the same leaf

differently, causing inconsistencies in identifying diseases. Mistakes can happen, like

misidentifying a disease or missing early signs of infection that are hard to see.

In connection with that, the purpose of this study is to create a dataset of healthy leaf

and eggplant leaf diseases such as Cercospora Leaf Spot, Verticillium Wilt and Early

Blight and train the dataset using Mask R-CNN for classification and detection of healthy

leaf and with leaf disease.

Objectives

The main objective of the research is to apply MASK R-CNN for the

classification of eggplant leaf diseases. Specifically, this study aims to achieve the

following:

 To create a new dataset of healthy eggplant leaves and leaves with diseases such

as Cercospora leaf spot, Verticillium wilt, and Early blight.

 To train a fine-tuned Mask R-CNN model for the classification of healthy

and diseased eggplant leaves.

 To evaluate the Mask R-CNN model's performance in terms of precision, recall

(sensitivity), and F1 score in classifying healthy eggplant leaves and diseases

such as Cercospora leaf spot, Verticillium wilt, and Early blight.

Significance of the Study


This study intends to create a dataset of eggplant leaf images and train the

Mask R-CNN model to classify and detect healthy and diseased eggplant leaves.

Eggplant Farmers: The result of this study will help the farmers recognize the diseases

present in the eggplant leaves.

Eggplant Vendors: Vendors in the eggplant supply chain will benefit indirectly. Vendors

can gain access to a more consistent and presumably higher-quality supply of eggplants

through enhanced disease management procedures. This can contribute to a stable and

reliable supply of produce for vendors.

Consumers: The result of this study helps consumers benefit indirectly since they now

have access to better and potentially more abundant eggplant produce as a result of

enhanced disease management procedures suggested by the study.

Department of Agriculture: The Department of Agriculture can use the study's

findings to improve their support and advice for eggplant farmers. The use of innovative

technologies like MASK R-CNN in disease diagnosis is consistent with modern

agricultural methods, adding to the overall development of agricultural processes.

Future Researchers: They can expand on the results of this study to investigate new

strategies or enhancements in using MASK R-CNN for disease detection. This research

can help to further the development of creative solutions in imaging and agriculture.

Scope and Limitations

This study will focus on creating a dataset of eggplant leaf images and training the Mask

R-CNN model to classify healthy and diseased eggplant leaves. The study will train the Mask R-

CNN to distinguish healthy leaves from leaf diseases such as

Cercospora leaf spot, Verticillium wilt, and Early blight. The trained Mask R-

CNN will only classify diseases included in the dataset; new and unknown diseases may

not be recognized.

Definition of Terms

For a better understanding of this study, the following terms are defined in the context of

this research.

Accuracy- is a crucial metric for evaluating the performance of deep learning models. It

helps to determine the effectiveness of a model and its ability to make correct predictions. In

this study, accuracy is used as one of the measures to test the pre-trained model to classify

eggplant leaf diseases (Simon and Schuster, 2021).

Bacteria- are single-celled microorganisms belonging to the domain Bacteria, capable of

causing diseases in plants. In this study, "bacteria" refers to the presence of pathogens such

Pseudomonas solanacearum in eggplant tissues or soil samples (Madigan et al., 2020).

Bacterial Wilt- is a plant disease caused by bacterial pathogens, leading to wilting and death

of the plant. In this study, bacterial wilt referred to a disease in eggplant plants caused by

bacterial pathogens like Pseudomonas solanacearum (Hayward, 1991).

Classification - is the operation of distributing objects into classes or groups which are, in

general, less numerous than the objects themselves. In this study, classification involves

assigning eggplant leaf images into classes such as healthy and diseased (Internet Encyclopedia of

Philosophy).

Early blight- caused by Alternaria species, is one of the major diseases in the production of

tomato (Solanum lycopersicum), potato (Solanum tuberosum) and other plants, and is most

prevalent on unfertilized or otherwise stressed plants. In this study, it entails identifying circular

lesions with concentric rings on the leaves of tomato and potato plants, confirming the

presence of Alternaria species as the causal agent (Jindo et al., 2021).

Eggplant leaves- Refer to the foliage of the eggplant (Solanum melongena), which is a

member of the nightshade family. In this study, eggplant leaves will be used as the target

object of detection and classification. The MASK-RCNN will be trained to identify these

leaves within the image (Jindo et al., 2021).

Epochs- defines the number of times that the learning algorithm will work through the entire

training dataset. In this study, the researchers will examine how the number of epochs

affects the performance of machine learning models (Brownlee, 2022).
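The relationship between epochs and update steps can be sketched in a few lines of Python. The toy dataset size and the 30-epoch setting below are illustrative only (the epoch counts explored in this study appear in Appendix C); the `model_step` callback is a hypothetical placeholder, not the actual training code:

```python
def train(model_step, dataset, epochs):
    """Run the learning algorithm through the entire training
    dataset `epochs` times; `model_step` stands in for a single
    parameter update and is a hypothetical placeholder."""
    updates = 0
    for _ in range(epochs):
        for sample in dataset:
            model_step(sample)
            updates += 1
    return updates

# 30 epochs over a 12-image toy dataset -> 12 * 30 update steps
print(train(lambda sample: None, range(12), epochs=30))  # 360
```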

F1 Score- is a standard metric in various machine learning and data analysis fields. In this

study, it is the harmonic mean of precision and recall (Van Rijsbergen, 1979).

Fungi- are eukaryotic organisms, including molds, mildews, and mushrooms, capable of

causing plant diseases. This study deals with fungal pathogens such as Cercospora

melongenae and Verticillium dahliae, which damage eggplant leaves (Alexopoulos et al.,

1996).

Healthy Leaf- A healthy leaf exhibits no signs of disease, damage, or physiological

abnormalities. In this study, a healthy leaf meets predefined visual criteria such as consistent

green color, uniform texture, and absence of visible disease symptoms, as identified by image

analysis systems (Taiz & Zeiger, 2010).

Image Annotation - is the process of labeling or tagging images with metadata to describe

the objects or regions within the images. In this study, image annotation is used to prepare the

dataset on which Mask R-CNN, built on top of Faster R-CNN, a Region-Based Convolutional

Neural Network, is trained to detect and classify eggplant leaves (Deruyver et al., 2009).

Leaf Disease- is a kind of phenomenon to the natural growth of a plant which is not only

generated hurdles in agribusiness but is also responsible for hampering the agricultural

production of a country. In this study, "Leaf disease" refers to any pathological condition that

affects the leaves of a plant, disrupting its natural growth process (Sarkar et al., 2023).

Leaf Spot- is a plant disease characterized by the presence of spots on the leaves caused by

fungal or bacterial pathogens. This study defines leaf spot as the obvious indications of

fungal or bacterial infections on eggplant leaves, which appear as separate spots or lesions

(Agrios, 2005).

Mask R-CNN - is a state-of-the-art convolutional neural network (CNN) for image

segmentation and instance segmentation. Mask R-CNN was developed on top of Faster R-

CNN, a Region-Based Convolutional Neural Network, and is utilized in this study to detect

and classify eggplant leaves within the images accurately (Yoo et al., 2022).

Nematodes- are roundworms belonging to the phylum Nematoda, capable of parasitizing

plant roots and causing diseases. This research deals with parasitic worms such as root-knot

nematodes which damage eggplant roots (Perry et al., 2009).

Overfitting- a modeling error in statistics that occurs when a function is too closely aligned

to a limited set of data points. In this study, the researchers will look at how overfitting

affects the accuracy and reliability of machine learning models (Twin, 2021).

Precision - refers to the accuracy of the annotations made by a model or annotator. It

measures the proportion of correctly annotated objects or regions compared to all objects or

regions annotated as positive (i.e., belonging to a specific class or category). In the context of

this study, precision is evaluated to assess the accuracy of the MASK-RCNN model in

correctly identifying and classifying eggplant leaves (Olusanya et al., 2022).

Recall - measures the completeness of annotations. In image annotation, recall indicates the

proportion of correctly annotated objects or regions compared to all actual objects or regions

belonging to the class of interest in the image. In this study, recall is used to evaluate the

comprehensiveness of the MASK R-CNN model in detecting all instances of eggplant leaves

within the images (Aleev et al., 2012).
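The precision, recall, and F1-score metrics defined above can be computed directly from confusion-matrix counts. A minimal Python sketch follows; the counts are hypothetical and chosen only to illustrate an F1 score of 0.94, similar to the Verticillium wilt result reported in this study, not taken from its actual confusion matrix:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall (sensitivity), and F1 score from
    true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical counts for a single class
p, r, f1 = precision_recall_f1(tp=47, fp=3, fn=3)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.94 0.94 0.94
```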

Utilizing - refers to the act of making practical or effective use of something, typically to

achieve a specific purpose or goal. In this study, utilizing can be understood as the process of

employing or applying resources, tools, or techniques in a manner that maximizes their

efficiency and effectiveness toward accomplishing a desired outcome or objective (Oxford

University Press, 2021).

Viruses- are infectious agents consisting of genetic material (DNA or RNA) enclosed in a

protein coat, capable of causing diseases in plants. This study focuses on mosaic, a disease

caused by viruses that causes different browning or mottling patterns on eggplant leaves

(Knipe & Howley, 2013).

CHAPTER II

REVIEW OF RELATED LITERATURE

In this section, the researchers will review literature and studies on utilizing computer

vision and deep learning for agricultural disease detection, with a particular focus on plant disease

classification using Mask R-CNN.

Machine Learning and Deep Learning

Panigrahi et al. (2020) cited that plant diseases significantly reduce agricultural

productivity, posing challenges for farmers in detection and control. Early disease detection is

crucial to prevent further losses. Research has focused on using supervised machine learning

techniques for this purpose, particularly for maize plant disease detection through plant

images. Studies have analyzed and compared methods such as Naive Bayes (NB), Decision

Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest

(RF). Among these, the RF algorithm has shown the highest accuracy at 79.23%, making it the

most effective model for disease prediction. These trained models aim to assist farmers in the

early detection and classification of plant diseases, offering a preventive approach to managing

crop health.

Wani et al. (2021) also cited that plant disease detection is vital for productive agriculture

and a robust economy. Traditional methods for detecting plant diseases are labor-intensive,

time-consuming, and require specialized expertise. Recently, automatic plant disease detection

has emerged as a significant research area, attracting considerable attention from academics,

researchers, and practitioners. Machine Learning (ML) and Deep Learning (DL) techniques

offer promising solutions for early disease detection on plant leaves.

This comprehensive review explores the potential of ML models in identifying plant

diseases. The study focuses on diseases and infections affecting four crop types: Tomato,

Rice, Potato, and Apple. It begins with an examination of the various infections and diseases

associated with these crops, detailing their causes and symptoms. The review then delves into

the steps involved in plant disease detection and classification using ML and DL. The

availability of online datasets for plant disease detection is also discussed, alongside an

evaluation of different ML and DL classification models proposed by researchers worldwide.

These evaluations consider performance metrics, datasets used, and feature extraction

methods. Finally, the review identifies challenges in using ML and DL for plant disease

detection and outlines future research directions in this field.

Deep learning learns the image features and extracts contextual details and global features

that will help in reducing the error remarkably. A deep learning architecture based on a

convolutional neural network (CNN) was used to identify vegetables with minimal color

distortion. For classifying the quality level, the CNN selects the region (Bhargava & Bansal,

2021).

Computer Vision

According to the study of Sabzi et al. (2020) a computer vision system has been proposed

for the automatic recognition and classification of five varieties of plant leaves under

controlled laboratory imaging conditions. The varieties include: 1) Cydonia oblonga

(quince), 2) Eucalyptus camaldulensis dehn (river red gum), 3) Malus pumila (apple), 4)

Pistacia atlantica (Mt. Atlas mastic tree), and 5) Prunus armeniaca (apricot). A total of 516

images of tree leaves were captured, and 285 features were computed for each image. These

features included shape, color, texture based on the gray level co-occurrence matrix,

histogram-based texture descriptors, and moment invariants.

Seven discriminant features were selected for classification using three classifiers: hybrid

artificial neural network–ant bee colony (ANN–ABC), hybrid artificial neural network–

biogeography-based optimization (ANN–BBO), and Fisher linear discriminant analysis

(LDA). The mean correct classification rates (CCR) achieved were 94.04% for hybrid ANN–

ABC, 89.23% for hybrid ANN–BBO, and 93.99% for LDA.

Additionally, the best classifier performance metrics for mean area under the curve (AUC),

mean sensitivity, and mean specificity for the five tree varieties were as follows:

1. Cydonia oblonga (quince): AUC = 0.991 (ANN–ABC), Sensitivity = 95.89% (ANN–

ABC), Specificity = 95.91% (ANN–ABC)

2. Eucalyptus camaldulensis dehn (river red gum): AUC = 1.00 (LDA), Sensitivity = 100%

(LDA), Specificity = 100% (LDA)

3. Malus pumila (apple): AUC = 0.996 (LDA), Sensitivity = 96.63% (LDA), Specificity =

94.99% (LDA)

4. Pistacia atlantica (Mt. Atlas mastic tree): AUC = 0.979 (LDA), Sensitivity = 91.71%

(LDA), Specificity = 82.57% (LDA)

5. Prunus armeniaca (apricot): AUC = 0.994 (LDA), Sensitivity = 88.67% (LDA), Specificity

= 94.65% (LDA)

This study demonstrates the effectiveness of these classifiers in accurately identifying and

classifying various plant leaves, with LDA showing particularly strong performance for

several of the varieties.

Mask R-CNN: An Overview

Mask-RCNN is a deep learning algorithm for object detection and instance segmentation,

building upon Faster-RCNN by predicting both bounding boxes and precise segmentation

masks for each object. Its key innovations include the ROIAlign technique for accurate

spatial information and the addition of a mask head branch for fine-grained pixel-level

segmentation.

Figure 1: Mask R-CNN Overview

The Feature Pyramid Network (FPN) shown in Figure 1 plays a crucial role in Mask

R-CNN's feature extraction. FPN constructs a multi-scale feature pyramid that incorporates

information from different image scales. This allows the model to gain a more

comprehensive understanding of object context, facilitating superior object detection and,

particularly important for Mask R-CNN, instance segmentation across a wide range of object

sizes.

Mask R-CNN Architecture

Mask R-CNN was proposed by Kaiming He et al. in 2017. It is very similar to Faster R-

CNN except that there is an additional branch to predict segmentation masks. The stage of region proposal

generation is the same in both architectures; the second stage, which works in parallel,

predicts the class, generates a bounding box, and outputs a binary mask for each RoI.

The architecture of Mask R-CNN is built upon the Faster R-CNN architecture, with the

addition of an extra "mask head" branch for pixel-wise segmentation. The overall

architecture can be divided into several key components shown in figure 2:

Figure 2: Mask R-CNN Architecture

Backbone Network

Mask R-CNN leverages a pre-trained Convolutional Neural Network (CNN) like ResNet

or ResNeXt as its backbone. This powerful component extracts high-level features from the

input image, capturing essential characteristics of objects present. To address the challenge of

objects with varying sizes, a Feature Pyramid Network (FPN) is then built on top of the

backbone.

The FPN tackles the challenge of objects with varying sizes by constructing a multi-scale

feature pyramid. This pyramid cleverly combines features from different levels of the

backbone network. Low-resolution features from deeper layers provide rich semantic

information, crucial for understanding object categories. Conversely, higher-resolution

features from shallower layers offer more precise spatial details, essential for accurately pinpointing object

boundaries, particularly for smaller objects.

Figure 3: Feature Pyramid Network (FPN) backbone

Figure 3 demonstrates the architecture of a Feature Pyramid Network (FPN). This

network is designed to efficiently process images and detect objects of varying sizes by

extracting features at multiple scales.

The FPN in Mask R-CNN consists of the following steps:

1. Feature Extraction: The backbone network extracts high-level features from the input

image.

2. Feature Fusion: FPN creates connections between different levels of the backbone

network to create a top-down pathway. This top-down pathway combines high-level

semantic information with lower-level feature maps, allowing the model to reuse features at

different scales.

3. Feature Pyramid: The fusion process generates a multi-scale feature pyramid, where each

level of the pyramid corresponds to different resolutions of features. The top level of the

pyramid contains the highest-resolution features, while the bottom level contains the lowest-

resolution features.

The feature pyramid generated by FPN enables Mask R-CNN to handle objects of various

sizes effectively. This multi-scale representation allows the model to capture contextual

information and accurately detect objects at different scales within the image.
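The three steps above can be sketched with plain NumPy: each coarser map is upsampled and added to the finer map below it (after the 1x1 convolution that, in a real network, projects every level to a common channel count). This is a minimal illustration of the top-down fusion idea, not the actual Mask R-CNN implementation; the map sizes and channel count are invented for the example.

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a (H, W, C) feature map."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def build_fpn(backbone_maps):
    """Fuse backbone feature maps (finest first) into an FPN pyramid.

    backbone_maps: list of (H, W, C) arrays, e.g. strides 4, 8, 16, 32,
    already projected to a common channel count (the 1x1 conv step).
    Returns the pyramid levels, finest first.
    """
    pyramid = [backbone_maps[-1]]          # start from the coarsest map
    for fmap in reversed(backbone_maps[:-1]):
        top_down = upsample2x(pyramid[0])  # carry coarse semantics upward
        pyramid.insert(0, fmap + top_down) # fuse with finer spatial detail
    return pyramid

# Toy backbone outputs: 16x16, 8x8, and 4x4 maps with 8 channels each.
maps = [np.ones((16, 16, 8)), np.ones((8, 8, 8)), np.ones((4, 4, 8))]
pyramid = build_fpn(maps)
print([p.shape for p in pyramid])  # [(16, 16, 8), (8, 8, 8), (4, 4, 8)]
```

Each output level keeps its own resolution while accumulating semantic information from every coarser level above it, which is what lets the detector handle both large and small leaf regions.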

Region Proposal Network (RPN)

The Region Proposal Network (RPN) is a crucial component inherited from Faster R-

CNN. It analyzes the feature map produced by the backbone network and proposes potential

regions of interest (candidate bounding boxes) that might contain objects within the image.

Additionally, the RPN goes beyond just proposing boxes: for each proposed region it also predicts an objectness score, the probability that the region contains an object rather than background.

Figure 4: Region Proposal Network
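The dense grid of candidate boxes that the RPN scores can be illustrated with a short NumPy sketch: one anchor per feature-map cell, scale, and aspect ratio, mapped back to image coordinates through the stride. The stride, scales, and ratios below are illustrative values, not the configuration of any particular implementation.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return (N, 4) anchor boxes as (x1, y1, x2, y2) in image coordinates.

    One anchor per (cell, scale, ratio) combination, centered on each
    feature-map cell mapped back to the input image via the stride.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # width/height chosen so that w/h == r and w*h == s**2
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)

# A 4x4 feature map with stride 16 covers a 64x64 image region.
boxes = generate_anchors(4, 4)
print(boxes.shape)  # (96, 4): 16 cells x 2 scales x 3 ratios
```

The RPN then scores each of these anchors and regresses offsets for the promising ones; only the top-scoring, non-overlapping proposals are passed on to the second stage.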

ROI Align

After the RPN generates region proposals, the ROIAlign (Region of Interest Align) layer

is introduced. This step helps to overcome the misalignment issue in ROI pooling.

ROIAlign plays a crucial role in accurately extracting features from the input feature map for

each region proposal, ensuring precise pixel-wise segmentation in instance segmentation

tasks. The primary purpose of ROIAlign is to align the features within a region of interest

(ROI) with the spatial grid of the output feature map. This alignment is crucial to prevent

information loss that can occur when quantizing the ROI's spatial coordinates to the nearest

integer (as done in ROI pooling).

Figure 5: ROI Align Operation

Figure 5 illustrates the Region of Interest (RoI) Align operation, a crucial component in

object detection and instance segmentation tasks.

The ROIAlign process involves the following steps:

1. Input Feature Map: The process begins with the input feature map, which is typically

obtained from the backbone network. This feature map contains high-level semantic

information about the entire image.

2. Region Proposals: The Region Proposal Network (RPN) generates region proposals

(candidate bounding boxes) that might contain objects of interest within the image.

3. Dividing into Grids: Each region proposal is divided into a fixed number of equal-sized

spatial bins or grids. These grids are used to extract features from the input feature map

corresponding to the region of interest.

4. Bilinear Interpolation: Unlike ROI pooling, which quantizes the spatial coordinates of the grids to the nearest integer, ROIAlign uses bilinear interpolation to calculate the pooling contributions for each grid. This interpolation ensures a more precise alignment of the features within the ROI.

5. Output Features: The features obtained from the input feature map, aligned with each

grid in the output feature map, are used as the representative features for each region

proposal. These aligned features capture fine-grained spatial information, which is crucial for

accurate segmentation.

By using bilinear interpolation during the pooling process, ROIAlign significantly

improves the accuracy of feature extraction for each region proposal, mitigating

misalignment issues. This precise alignment enables Mask R-CNN to generate more accurate

segmentation masks, especially for small objects or regions that require fine details to be

preserved. As a result, ROIAlign contributes to the strong performance of Mask R-CNN in

instance segmentation tasks.
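The bilinear sampling at the heart of ROIAlign can be sketched in a few lines of NumPy: a sampling point with fractional coordinates takes a weighted average of its four integer neighbors, instead of snapping to the nearest cell as ROI pooling does. This is a single-channel illustration, not the batched multi-channel operator used in practice.

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample feature map `fmap` (H, W) at fractional position (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, fmap.shape[0] - 1), min(x0 + 1, fmap.shape[1] - 1)
    dy, dx = y - y0, x - x0
    # Weighted average of the four surrounding integer grid points.
    return (fmap[y0, x0] * (1 - dy) * (1 - dx) +
            fmap[y0, x1] * (1 - dy) * dx +
            fmap[y1, x0] * dy * (1 - dx) +
            fmap[y1, x1] * dy * dx)

fmap = np.array([[0.0, 1.0],
                 [2.0, 3.0]])
# Sampling exactly between all four cells averages them: (0+1+2+3)/4.
print(bilinear_sample(fmap, 0.5, 0.5))  # 1.5
```

Because the sample position is never rounded, no sub-pixel information is discarded, which is exactly the misalignment that ROI pooling suffers from.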

Mask Head

The Mask Head is an additional branch in Mask R-CNN, responsible for generating

segmentation masks for each region proposal. The head uses the aligned features obtained

through ROIAlign to predict a binary mask for each object, delineating the pixel-wise

boundaries of the instances. The Mask Head is typically composed of several convolutional layers followed by upsampling layers (deconvolution or transposed convolution layers).

Figure 6: Mask Head Structure

Figure 6 illustrates the Mask Head structure, a component commonly used in object

detection and instance segmentation tasks.

Related Studies

Eggplant Leaf Disease Classification

Several studies have explored diverse approaches for disease detection in eggplants and

tea leaves, with varying degrees of success.

Aravind et al. (2019) achieved high accuracy (96.7%) using transfer learning models like

VGG16 and AlexNet for *Solanum Melongena* classification, but their model struggled

with illumination and color variations. Maggay et al. (2020) proposed a mobile-based

system for eggplant disease recognition but only achieved 80-85% accuracy, limiting its

practical use. Xie used spectral and texture features with KNN and AdaBoost for detecting

early blight in eggplant leaves, reaching an accuracy of 88.46%, although HLS images

underperformed.

Wei et al. employed infrared spectroscopy for identifying seedlings affected by root knot

nematodes with 90% accuracy, but their method was only effective under normal conditions.

Sabrol et al. used GLCM and ANFIS for eggplant disease classification, achieving 98%

accuracy, though constrained by a small dataset. Wu et al. combined visible and near-infrared

reflectance spectroscopy with neural networks for early disease detection in eggplants,

reaching 85% accuracy.

Li et al.'s work stands out for utilizing Mask R-CNN for precise segmentation, wavelet

transform for feature enhancement, and F-RNet for classification, achieving 98.7% detection

accuracy for disease and insect spots in tea leaves, outperforming other models. These

studies highlight the potential of advanced image processing and machine learning methods

for plant disease detection, while also underscoring challenges related to environmental

variability, image types, and dataset limitations.

Eggplant Leaf Diseases

Eggplant leaf diseases are significant constraints in profitable eggplant production, as

highlighted by Herbert Dustin Aumentado and Mark Angelo Balendres. Identifying disease-

causing agents and emerging pathogens is crucial for effective disease management. In

November 2021, a new fungal species, *Diaporthe melongenae* sp. nov., was discovered in

Cavite, Philippines, following an outbreak of leaf blights with a 10%-30% disease incidence,

leading to severe economic losses for farmers. Environmental factors and host characteristics

play a role in the severity of the disease. Besides fungal infections, other major challenges

include fruit and shoot borer infestations, bacterial wilt, irrigation issues, and climate-related

problems. The overuse of pesticides in eggplant farming also raises health and environmental concerns. To address these issues, Local Government Units should prioritize extension services, research, varietal improvement, and sustainable pest control strategies.

Application of Mask R-CNN Model in identification of Eggplant Flowering Period

The study by Bhattacharya et al. explores the use of the Mask R-CNN model for

identifying eggplant flowering stages, with techniques that can also be applied to classifying

eggplant leaf diseases. To optimize the model, the researchers combined regular and dilated

convolutions and modified the ResNet-50 backbone, improving the model's ability to capture

fine-grained details over larger regions, such as eggplant flowers. These modifications likely

enhanced feature extraction by expanding the receptive field and learning global feature

relationships. Transfer learning with a pre-trained ResNet-50 model was also used to prevent

overfitting, improving training efficiency and generalization to new data. The optimized

model achieved high accuracy, with a mean Average Precision (mAP) of 0.962 and a mean

Intersection over Union (mIOU) of 0.715, indicating strong object detection and

segmentation performance. This research presents a more effective method for identifying

eggplant flowering stages, with potential applications in automating pollination and

improving management practices, offering economic benefits for growers.

Recognizing apple leaf diseases using a novel parallel real-time processing framework

based on MASK RCNN and transfer learning: An application for smart agriculture

In the study by Rehman et al., a novel parallel real-time processing framework based on

Mask R-CNN and transfer learning is proposed for recognizing apple leaf diseases, similar to

its application in other plant disease recognition tasks. Techniques like hybrid contrast

stretching for data preprocessing and the use of a pre-trained CNN model for feature

extraction could be adapted to improve input data quality and feature extraction in eggplant

leaf disease detection. The study highlights the importance of automated systems for accurate

disease identification, as manual inspection is often time-consuming and error-prone. By

employing a parallel framework, the system enhances image contrast before using Mask R-

CNN to detect diseased regions. Feature extraction is improved through Kapur’s entropy and

the MSVM method. Tested on the Plant Village dataset, the system achieved an accuracy of

96.6% using the ESDA classifier, outperforming previous methods in apple leaf disease

classification. This approach could be beneficial for real-time processing in smart agriculture

applications, ensuring rapid and accurate disease detection.

Mask R-CNN Refitting Strategy for Plant Counting and Sizing in UAV Imagery

In Machefer et al.’s study, Mask R-CNN is applied to plant counting and sizing in UAV

imagery, a different task but still relevant to plant detection and segmentation. The study

emphasizes the benefits of transfer learning, which could reduce the need for labeled data in

eggplant disease detection by using pre-trained Mask R-CNN models. Machefer et al.

combined remote sensing and deep learning to accurately detect and segment individual plants,

such as potato and lettuce, in aerial images. Fine-tuning Mask R-CNN allowed them to

optimize parameters for the task, and the model outperformed traditional computer vision

methods. The performance was evaluated using mean Average Precision (mAP) for

segmentation and Multiple Object Tracking Accuracy (MOTA) for detection, achieving a mAP

of 0.418 for potato plants and 0.660 for lettuces, and a MOTA of 0.781 for potato plants and

0.918 for lettuces. These results demonstrate the model's effectiveness in both detection and

segmentation tasks, suggesting potential adaptations for improving plant disease detection in

similar applications.

Symptom Recognition of Disease and Insect Damage Based on Mask R-CNN, Wavelet

Transform, and F-RNET

Li et al.'s study introduces a robust framework for recognizing tea leaf disease and insect

damage symptoms using Mask R-CNN, wavelet transforms, and F-RNet, which holds

relevance for eggplant leaf disease classification research. The framework utilizes Mask R-

CNN for object detection and segmentation, wavelet transforms for feature enhancement, and

F-RNet for classification. This approach addresses the limitations of manual identification

methods, which suffer from low accuracy and efficiency, offering a more precise and scalable solution. In their study, Mask R-CNN successfully segmented 98.7% of disease and insect

spots from tea leaves, ensuring that nearly all affected areas were captured. The two-

dimensional discrete wavelet transform further refined the image features by generating images

with four different frequency representations, which were then input into the Four-Channeled

Residual Network (F-RNet) for classification. The F-RNet model achieved an accuracy of

88%, outperforming other models like SVM, AlexNet, VGG16, and ResNet18, demonstrating

its effectiveness in both disease and pest recognition. This methodology not only provides a

highly accurate solution for identifying tea leaf diseases like brown blight, target spot, and tea

coal, along with pests such as *Apolygus lucorum*, but also has the potential for adaptation to

eggplant leaf disease detection. By integrating these techniques, researchers could improve the

precision and robustness of eggplant disease classification systems, paving the way for more

effective AI-driven agricultural management strategies.

Synthesis

Related research shows that the Mask R-CNN model has achieved classification accuracy as high as 98.7%. Studies also reveal that many methods have been used to classify and detect the diseases found specifically on eggplant leaves; one study on leaf diseases, however, achieved an accuracy of only 85%. To improve the performance of Mask R-CNN in detecting and classifying the diseases found on eggplant leaves in the Philippines, the researchers will create a dataset, annotate the images with labels, pre-process them by resizing, and split them into training and validation sets for the Mask R-CNN model. The three types of diseases covered by the dataset are Cercospora leaf spot, Verticillium wilt, and Early blight.

Table 1: This table compares several approaches used in plant disease detection and

classification using various machine learning and deep learning techniques.

Study/Technique       Algorithm/Model                           Accuracy/Performance                      Dataset/Focus
Bhattacharya et al.   Mask R-CNN w/ ResNet-50                   mAP: 0.962, mIOU: 0.715                   Eggplant Flower Detection
Rehman et al.         Mask R-CNN + Transfer Learning            Accuracy: 96.6%                           Apple Leaf Disease Detection
Machefer et al.       Mask R-CNN + UAV Imagery                  mAP: 0.418 (potatoes), 0.660 (lettuce)    Plant Counting and Sizing with UAV Imagery
Li et al.             Mask R-CNN + Wavelet Transform + F-RNet   Accuracy: 98.7% for segmentation          Tea Leaf Disease and Insect Damage Detection
Aravind et al.        VGG16, AlexNet (Transfer Learning)        Accuracy: 96.7%                           Eggplant Leaf Classification

Conceptual Framework

Figure 7: Conceptual Framework


Figure 7 shows the conceptual framework, which outlines the steps involved in creating the eggplant leaf dataset. The input images are pre-processed and annotated, and the dataset is then divided into two subsets: a training set and a validation set for the Mask R-CNN model. The training set is used to train the model, while the validation set guides the tuning of hyperparameters (epochs, learning rates). The performance of the trained model is then assessed on the validation set by calculating metrics such as precision, recall (sensitivity), and F1 score. If the model's performance is not satisfactory, the researchers may revisit previous steps, such as adding more diverse data, adjusting the model architecture, or optimizing hyperparameters, to improve classification accuracy.

CHAPTER III

METHODOLOGY

Research Method

In this study, the researchers' main objective is to create a dataset of eggplant leaf diseases and apply a fine-tuned Mask R-CNN model to classify healthy and diseased leaves. The researchers will assemble a diverse collection of 4,000 eggplant leaf images displaying common leaf diseases, namely Cercospora leaf spot, Verticillium wilt, and Early blight, as well as healthy leaves. Each image will undergo thorough pre-processing and annotation, with labels highlighting the regions of interest that correspond to diseased areas, ensuring correct disease classification and detection.

The study involves several processes: image collection; image pre-processing, which includes removing backgrounds and noise; and classification using the Mask R-CNN model. During collection of the image dataset, samples of eggplant leaves from each class were photographed and sorted. Second, image pre-processing will be used to remove backgrounds and noise and to resize the images, improving the performance of the Mask R-CNN model trained on the data. Lastly, the Mask R-CNN's performance in classifying healthy and diseased eggplant leaves will be evaluated in terms of precision, recall (sensitivity), and F1 score.

Additionally, the study utilized Google Colab as the platform for model training and

validation, employing the configurations of learning rate and epochs to assess the

performance of the Mask-RCNN model in classifying the aforementioned diseases.

Table 2: Sample Images of Eggplant Leaf Diseases

Healthy: A leaf in good condition is vibrant green in color, with a smooth texture, showing no signs of discoloration, spots, or damage (University of Florida, Institute of Food and Agricultural Sciences, 2021).
Early blight: Concentric rings or bull's-eye patterns on leaves, dark lesions with yellow halos, leaf yellowing and defoliation (Florida Plant Disease Management Guide: Eggplant, 2021).
Cercospora leaf spot: Small purple/brown spots on leaves, with lesions growing larger and causing defoliation (Cercospora Leaf Spot, 2019).
Verticillium wilt: A soil-borne fungal disease leading to wilting, yellowing, browning, and necrosis of leaves. Vascular discoloration is visible in cross-sectioned stems (University of Wisconsin-Madison, Division of Extension, 2019).

Table 2 provides visual examples of eggplant leaves affected by different diseases. The

images showcase the distinct characteristics associated with each disease, such as color

changes, leaf spots, and wilting.

Data Collection Procedure

The researchers will gather a minimum of 1,000 images for each category of eggplant leaf: Cercospora leaf spot, Verticillium wilt, Early blight, and Healthy. Sample images were acquired manually indoors with a Vivo Y100 mobile phone equipped with a 12-megapixel camera, against a white background. The leaves were rotated between shots and photographed under natural light with no filter. A total of 1,000 images per class (each disease and the healthy leaf) were acquired and saved to folders in JPG format. Before training the Mask R-CNN model, the collected images will be pre-processed, annotated, and split into training and validation sets. During training of the Mask R-CNN, the learning rate and number of epochs will be adjusted to analyze performance in terms of precision, recall, and F1 score.

Figure 8. Camera control setting during image acquisition

Figure 8 includes three images captured during image acquisition, showing different stages of capturing eggplant leaf samples with varying angles, lighting conditions, and close-ups of specific disease symptoms.

Table 3. The camera control settings used during image acquisition.

Variable Settings

Image Size 4080 x 2296

Flash No flash

Image Type JPG

Table 3 outlines the specific camera settings used during the image acquisition

process. The images were captured at a resolution of 4080 x 2296 pixels, with the flash

turned off. The images were saved in JPEG format, a widely used image file type that

offers a balance between image quality and file size.

Data Pre-processing

Before training the Mask R-CNN model, the researchers need to get the images ready for analysis. The images will be resized to a standard size of 244x244 pixels so that they are uniform and easier for the model to handle. Data augmentation is another important step: slight changes such as rotating or flipping the images give the model more examples to learn from. Finally, the researchers will need to make sure their annotations, which mark out the diseased areas on the leaves, match up exactly with the pre-processed images.
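The rotation and flip augmentations just described can be sketched with NumPy array operations; a real pipeline would typically use a library such as TensorFlow's or PyTorch's augmentation utilities, and would transform the annotation masks with the same operations so image and labels stay aligned. The shapes below assume the 244x244 size used in this study.

```python
import numpy as np

def augment(image, mask):
    """Yield (image, mask) pairs: the original plus simple flips/rotations.

    Applying identical transforms to image and mask keeps the
    annotations aligned with the pixels.
    """
    yield image, mask
    yield np.fliplr(image), np.fliplr(mask)   # horizontal flip
    yield np.flipud(image), np.flipud(mask)   # vertical flip
    yield np.rot90(image), np.rot90(mask)     # 90-degree rotation

image = np.zeros((244, 244, 3), dtype=np.uint8)   # resized RGB leaf image
mask = np.zeros((244, 244), dtype=np.uint8)       # binary disease mask
pairs = list(augment(image, mask))
print(len(pairs))  # 4 training examples from one annotated image
```

Each annotated image thus yields several geometrically varied training examples at no extra labeling cost.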

Input Image → Resize Image → Data Annotation → Pre-processed Output

Figure 9: Data Pre-processing

Figure 9 illustrates the data pre-processing pipeline. The process begins with an input image, which is resized to a fixed dimension (244x244 pixels). Data annotation is then performed to label the relevant objects or regions within the image, marking the specific areas that exhibit disease symptoms.


Table 4: Sample Images of Pre-processed Images
Original Pre-processed
Healthy

Cercospora leaf spot

Early blight

Verticillium wilt

Finally, the pre-processed output, a modified version of the input image, is generated. Table 4 compares original images of eggplant leaves with their corresponding pre-processed versions.

Data Splitting

In this study, the researchers divided the collected images into training and validation sets. The training set is used by the model to extract features and learn weights. The researchers allocated 70% of the data for training, a total of 2,800 sample images. Training on a sufficiently large and varied set helps prevent overfitting and ensures that the model generalizes well to unseen data; this portion teaches the model to recognize the patterns and features associated with each disease.

The validation set is used to evaluate the model during training and drives adjustments to the model's hyperparameters to enhance performance. The researchers allocated the remaining 30% of the data for validation, a total of 1,200 sample images. The model's performance metrics, such as precision, recall, and F1 score, are computed from its predictions on the validation set.
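A split matching the counts above (2,800 training and 1,200 validation images out of 4,000) can be produced with a short, reproducible shuffle. The folder layout and file names below are assumptions made for illustration.

```python
import random

def split_dataset(paths, train_frac=0.7, seed=42):
    """Shuffle deterministically and split into train/validation lists."""
    paths = sorted(paths)       # stable starting order
    rng = random.Random(seed)   # fixed seed -> reproducible split
    rng.shuffle(paths)
    cut = int(len(paths) * train_frac)
    return paths[:cut], paths[cut:]

# Hypothetical file names: 1,000 images per class, 4 classes.
all_images = [f"{cls}/{i:04d}.jpg"
              for cls in ("healthy", "early_blight",
                          "cercospora", "verticillium")
              for i in range(1000)]
train, val = split_dataset(all_images)
print(len(train), len(val))  # 2800 1200
```

Fixing the random seed makes the split reproducible across runs, so the same images always land in the same subset.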

Mask R-CNN

The researchers used Mask R-CNN to classify and segment healthy and diseased areas on eggplant leaves. It extends a model called Faster R-CNN by adding a branch that creates masks showing the boundaries of each object.

The model starts with a backbone network such as ResNet, which extracts important features from the images. These features are then used by a Region Proposal Network (RPN) to suggest possible regions where objects (diseased or healthy leaf areas) might be located. These regions are aligned accurately with the original image using a process called RoIAlign.

For each region, the model predicts a bounding box (the area around the object) and classifies it as either healthy or one of the disease types (Cercospora leaf spot, Verticillium wilt, Early blight).

Research Instrument

The confusion matrix is a tool used to evaluate the performance of the model, particularly in

terms of its classification. It provides a summary of the prediction results on a classification

problem. It shows the ways in which the researcher’s classification model is confused when it

makes predictions. It shows four types of results when the model makes predictions. The matrix

is structured as follows:

Table 5: Confusion Matrix

                   Predicted Positive     Predicted Negative
Actual Positive    True Positive (TP)     False Negative (FN)
Actual Negative    False Positive (FP)    True Negative (TN)

Table 5 is a representation of the performance of a classification model, specifically in

the context of eggplant leaf disease classification. It helps to assess how accurately the

model predicted the correct disease category for each sample.

The confusion matrix is a square matrix in which the columns represent the actual values and the rows represent the model's predicted values, or vice versa.

TP (True Positive): the actual value was positive, and the model predicted a positive value.

FP (False Positive): the actual value was negative, but the model predicted a positive value (also known as a Type I error).

FN (False Negative): the actual value was positive, but the model predicted a negative value (also known as a Type II error).

TN (True Negative): the actual value was negative, and the model predicted a negative value.

The study used the following performance metrics: precision, sensitivity, and F1_score.

Based on the article of Karimi (2021), the following formula is used to calculate the metrics:

Precision

Measures how many of the positive predictions made by the model were actually correct.

In other words, it assesses the model's ability to avoid false positives. A high precision

indicates that the model is good at identifying only relevant instances (Géron, A. 2019). The

formula for calculating the precision is given below:

Precision = TP / (TP + FP)

Recall (sensitivity)

Measures how many of the actual positive instances the model was able to correctly

identify. It assesses the model's ability to avoid false negatives. A high recall indicates that the

model is good at capturing most of the positive instances in the dataset (Géron, A. 2019). The

formula for calculating the recall is given below:

Recall = TP / (TP + FN)

F1 Score

is a harmonic mean of precision and recall, providing a single metric that balances the

importance of both. It is particularly useful when there is an imbalance between the positive

and negative classes in the dataset. A high F1-score indicates that the model is performing

well in terms of both precision and recall (Géron, A. 2019). The formula for calculating the

F1 score is given below:

F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
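The three formulas above translate directly into code; the confusion-matrix counts below are made-up numbers used only for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall (0 when both are 0).
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: 80 correct detections, 20 false alarms, 10 missed cases.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.889 0.842
```

Guarding the divisions keeps the functions well defined even for classes with no predictions or no actual instances.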

Data Analysis Procedure

Data analysis is a systematic process involving data collection, cleaning, exploration,

statistical analysis, modeling, and interpretation. By following these steps and employing

various techniques, analysts can uncover valuable insights and information from data,

enabling informed decision-making in various fields.

High Recall: recall values ranging from 0.91 to 1.0 indicate a high level of recall, signifying the model's ability to correctly identify the majority of actual positive instances.

Moderate Recall: recall values falling between 0.71 and 0.9 suggest moderate recall, showing that the model is reasonably effective at recognizing positive instances.

Moderately Low Recall: recall values between 0.51 and 0.7 indicate moderately low recall, suggesting that the model is only somewhat effective at identifying positive instances.

Low Recall: recall values of 0.5 or below represent low recall, indicating that the model is misclassifying a significant number of actual positive instances (Sklearn.metrics). The summary of the interpretation of recall (sensitivity) is presented in Table 6.

Table 6: Recall (Sensitivity) Interpretation of Results

Recall Rate      Interpretation
0.91 – 1.0       High recall; the model correctly identifies the majority of actual positive instances.
0.71 – 0.9       Moderate recall; the model is reasonably effective at identifying positive instances.
0.51 – 0.7       Moderately low recall; the model is only somewhat effective at identifying positive instances.
0.5 and below    Low recall; the model misclassifies a significant number of actual positive instances.

High Precision: a value ranging from 0.91 to 1.0 signifies high precision, indicating that the majority of the model's positive predictions are correct.

Moderate Precision: a value between 0.71 and 0.9 suggests moderate precision, showing that the model's positive predictions are reasonably accurate.

Moderately Low Precision: a value between 0.51 and 0.7 indicates moderately low precision, suggesting that a noticeable share of the model's positive predictions are incorrect.

Low Precision: a value of 0.5 or below represents low precision, indicating that a significant number of the positive predictions made by the model are incorrect (Sklearn.metrics). The summary of the interpretation of precision is presented in Table 7.

Table 7: Precision Interpretation of Results

Precision Rate   Interpretation
0.91 – 1.0       High precision; the majority of the model's positive predictions are correct.
0.71 – 0.9       Moderate precision; the model's positive predictions are reasonably accurate.
0.51 – 0.7       Moderately low precision; a noticeable share of the positive predictions are incorrect.
0.5 and below    Low precision; a significant number of the positive predictions are incorrect.

High F1 score: a value between 0.91 and 1.0 indicates outstanding performance, achieving a high balance between precision and recall.

Moderate F1 score: a value between 0.71 and 0.9 indicates solid performance, with a good balance between precision and recall.

Moderately low F1 score: a value between 0.51 and 0.7 indicates moderate performance, achieving a reasonable balance between precision and recall.

Low F1 score: a value of 0.5 or below indicates poor performance; the model shows some ability to balance precision and recall but is not reliable (Sklearn.metrics). Table 8 below summarizes the interpretation of the F1 score.

Table 8: F1-Score Interpretation of Results

F1-Score Rate    Interpretation
0.91 – 1.0       Outstanding performance; a high balance between precision and recall, suggesting the model is robust in distinguishing between positive and negative classes.
0.71 – 0.9       Solid performance; a good balance between precision and recall.
0.51 – 0.7       Moderate performance; a reasonable balance between precision and recall.
0.5 and below    Poor performance; some ability to balance precision and recall, but still not reliable.
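The bands in Tables 6–8 can be encoded as a small helper function; the thresholds below follow the tables above, with each band open at its lower boundary.

```python
def interpret_metric(value):
    """Map a metric value in [0, 1] to the interpretation bands above."""
    if value > 0.9:
        return "high"
    if value > 0.7:
        return "moderate"
    if value > 0.5:
        return "moderately low"
    return "low"

# Spot-check a value from each band.
for v in (0.95, 0.85, 0.6, 0.4):
    print(v, interpret_metric(v))
```

Such a helper keeps the interpretation of precision, recall, and F1 results consistent across all experiment tables.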

Data Annotation

The researchers will annotate images of diseased eggplant leaves as well as healthy eggplant leaves. For each image, a ground-truth labeled image is manually generated containing the individual segmentation of the diseases found on the leaf. The annotation tool used to produce the ground-truth masks is the VGG Image Annotator (VIA), developed by the Visual Geometry Group at the University of Oxford, which supports a wide variety of annotation types. The specific operation is to mark every diseased area on a leaf with a bounding box and then label the leaf according to its class (Healthy, Early blight, Cercospora leaf spot, or Verticillium wilt). The annotations are saved in JSON format together with their corresponding images for use by the instance segmentation class.
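A VIA project export can be converted into per-image annotations with a short parsing step. The structure below follows VIA's region-annotation JSON (filename, regions with `shape_attributes` and `region_attributes`), but the attribute key `label` is an assumption of this sketch; it depends on how the attribute was named in the annotator.

```python
import json

def load_via_annotations(json_text):
    """Parse a VIA export into {filename: [(label, shape_dict), ...]}."""
    data = json.loads(json_text)
    annotations = {}
    for entry in data.values():
        annotations[entry["filename"]] = [
            (r["region_attributes"].get("label", "unknown"),
             r["shape_attributes"])
            for r in entry.get("regions", [])
        ]
    return annotations

# Minimal hand-written example in VIA's export shape.
sample = json.dumps({
    "leaf1.jpg123": {
        "filename": "leaf1.jpg",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 10, "y": 20,
                                 "width": 50, "height": 40},
            "region_attributes": {"label": "Early blight"}
        }]
    }
})
print(load_via_annotations(sample)["leaf1.jpg"][0][0])  # Early blight
```

From these shapes, binary masks can then be rasterized per region to serve as the ground truth for the Mask Head.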

Ethical Considerations

Because the researchers used their own image dataset, no copyright infringement took place. The researchers also addressed several ethical issues during the research process.

Hyperparameter Optimization

Hyperparameters are important because they can have a big impact on model training as it

relates to training time, infrastructure resource requirements, model convergence and model

accuracy (Kasture, 2020). Selection of the right machine learning model and the

corresponding correct set of hyperparameter values are very crucial since they are the key for

the model's overall performance (Kumar, 2021). Optimization of the hyperparameters will be done through manual search: combinations of hyperparameter values will be tested, the model will be trained for each combination, and the combination that gives the best result will be selected. Below are the hyperparameters to be fine-tuned:

Learning rate: the learning rate is a hyperparameter that controls how much to change the

model in response to the estimated error each time the model weights are updated. Choosing a

too small value may result in a long training process that could get stuck, whereas choosing a

too large value may result in learning a sub-optimal set of weights too fast or an unstable

training process (Brownlee, 2019).

Epoch: the epoch indicates the number of passes of the entire training dataset the machine learning algorithm has completed (Bell, 2020). Too many epochs can lead to overfitting of the training dataset, whereas too few may result in an underfit model (Brownlee, 2018).
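The manual search over these two hyperparameters can be sketched as a simple loop over candidate values. Here `train_and_evaluate` is a hypothetical stand-in for the actual Mask R-CNN training and validation routine, assumed to return the metric being optimized (here, F1-score); the candidate values match those tested in Chapter IV:

```python
from itertools import product


def manual_search(train_and_evaluate,
                  learning_rates=(0.01, 0.001, 0.0001),
                  epochs=(10, 20, 30)):
    """Try every (learning rate, epoch) pair and keep the best F1-score."""
    best = None
    for lr, ep in product(learning_rates, epochs):
        f1 = train_and_evaluate(lr, ep)  # train the model once per combination
        if best is None or f1 > best[2]:
            best = (lr, ep, f1)
    return best  # (learning_rate, epochs, best_f1)
```

With 3 learning rates and 3 epoch settings, this trains the model nine times, which is feasible for a small search space but grows multiplicatively as hyperparameters are added.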

CHAPTER IV

RESULTS AND DISCUSSIONS

The researchers collected a dataset of eggplant leaves by capturing images specifically focused on these leaf diseases; removing the background from the images, a common image processing task, was applied during pre-processing. By analyzing the color distribution within the eggplant leaf images, the leaves were categorized into four classes: healthy, early blight, cercospora leaf spot, and verticillium wilt. These pre-processing steps are often necessary for subsequent analysis or machine learning tasks involving image datasets. The Mask R-CNN architecture was configured through its hyperparameters, namely learning rate and epoch. Several values of learning rate and epoch were tested to select the best setting for classifying eggplant leaf disease based on precision, sensitivity (recall), and F1-score. The Mask R-CNN architecture was then utilized to classify the four classes of eggplant leaf disease.

To prepare the dataset for training a model with Mask R-CNN, labeled images of eggplant leaf diseases and healthy leaves were organized into separate folders. Google Colab was set up, the dataset was uploaded and extracted with unzip, and libraries and frameworks such as TensorFlow were installed via pip install. Appropriate data loading and preprocessing techniques for Mask R-CNN were then applied. A sample of the dataset can be seen in the appendices.
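The folder-per-class layout described above can be turned into train/validation lists with a short helper. This is a sketch under stated assumptions: the class folder names are taken from the study's four classes, while `split_dataset`, `val_fraction`, and the `.jpg` extension are illustrative choices, not the thesis's actual code:

```python
import random
from pathlib import Path

# Class folder names assumed to mirror the four classes used in the study
CLASSES = ["Healthy", "Early_blight", "Cercospora_leafspot", "Verticillium_wilt"]


def split_dataset(root, val_fraction=0.2, seed=42):
    """Collect image paths per class folder and split into train/validation lists."""
    rng = random.Random(seed)
    train, val = [], []
    for cls in CLASSES:
        images = sorted((Path(root) / cls).glob("*.jpg"))
        rng.shuffle(images)  # shuffle within each class so both splits cover it
        cut = int(len(images) * (1 - val_fraction))
        train += [(p, cls) for p in images[:cut]]
        val += [(p, cls) for p in images[cut:]]
    return train, val
```

Splitting within each class (rather than over the pooled dataset) keeps every class represented in both the training and validation sets.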

Table 9: Training results of Mask R-CNN architecture based on precision, recall, and F1-
Score of healthy leaf.

Learning Rate Epoch Precision Recall F1-Score


0.01 10 0.71 1.00 0.83
0.001 10 0.32 0.40 0.37
0.0001 10 0.76 0.60 0.26
0.01 20 0.75 1.00 0.86
0.001 20 0.58 0.79 0.67
0.0001 20 0.16 0.73 0.35
0.01 30 0.65 1.00 0.79
0.001 30 0.33 1.00 0.49
0.0001 30 0.25 1.00 0.39

The Mask R-CNN architecture was successfully configured. For the learning rate of 0.01, the model shows strong results across different epochs, with a consistent recall of 1.00, indicating that it captures all positive cases. Precision ranges from 0.65 to 0.75, which suggests that while the model predicts most positives correctly, there are some false positives. As epochs increase, the F1-score improves, peaking at 0.86 at 20 epochs. This learning rate provides a good balance between learning speed and accuracy, with 20 epochs being the optimal setting based on F1-score.

With a learning rate of 0.001, the model requires more epochs to show improvement but still

struggles with lower precision and recall compared to 0.01. Precision starts low and improves

slightly with more epochs, reaching only 0.33 at 30 epochs, while recall hits 1.00 at that point.

This suggests that the model is highly sensitive to detecting positives but also misclassifies a

significant number of non-positives, resulting in a moderate F1-score. The combination of

lower precision and variable performance indicates that this learning rate may be too low for

this dataset within the given epochs.

At a learning rate of 0.0001, the model performs poorly overall, with low precision and F1-scores even as epochs increase. While recall reaches 1.00 by 30 epochs, precision remains very low, ranging from 0.16 to 0.25, indicating a large number of false positives. This learning rate is too low to effectively capture patterns in the data within the 10 to 30 epoch range, making it unsuitable for the current training setup.
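The F1-scores reported in these tables are the harmonic mean of precision and recall, which a small helper makes explicit. For example, precision 0.75 and recall 1.00 give 2 × 0.75 × 1.00 / 1.75 ≈ 0.86, matching the 0.01 / 20-epoch row of Table 9:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two inputs, a high recall cannot compensate for a very low precision (or vice versa), which is why several rows with recall 1.00 still have modest F1-scores.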

Table 10: Training results of Mask R-CNN architecture based on precision, recall, and
F1-score of Cercospora leaf spot.

Learning Rate Epoch Precision Recall F1-Score


0.01 10 0.50 0.45 0.48
0.001 10 0.24 0.22 0.37
0.0001 10 0.60 0.48 0.56
0.01 20 0.50 0.45 0.48
0.001 20 0.33 0.27 0.35
0.0001 20 0.64 0.52 0.50
0.01 30 0.40 0.50 0.44
0.001 30 0.35 0.30 0.40
0.0001 30 0.73 0.57 0.57

With a learning rate of 0.01, the Mask R-CNN model shows moderate and consistent performance across 10, 20, and 30 epochs, with precision, recall, and F1-scores hovering around 0.45 to 0.50. This steadiness suggests that the model may not be learning effectively at this rate, as increasing the epochs does not yield noticeable improvements, indicating possible underfitting where the model fails to capture sufficient complexity in the data. By contrast, at a learning rate of 0.001, the model performs poorly across all epochs, with precision and recall staying low and F1-scores between 0.35 and 0.40, making this rate too low to achieve a balanced performance. Finally, at a learning rate of 0.0001, the model shows steady improvement across epochs, reaching the best balance at 30 epochs with a precision of 0.73, recall of 0.57, and F1-score of 0.57. This gradual improvement across metrics indicates that the slower learning rate enables the model to capture data patterns more effectively over time, allowing for better generalization without overfitting, making it the most effective learning rate of the three.

Table 11: Training results of Mask R-CNN architecture based on precision, recall, and
F1-score of Early blight class.

Learning Rate Epoch Precision Recall F1-Score
0.01 10 0.92 0.55 0.69
0.001 10 0.39 0.35 0.37
0.0001 10 0.69 0.45 0.57
0.01 20 0.93 0.70 0.80
0.001 20 0.90 0.50 0.60
0.0001 20 0.50 0.05 0.09
0.01 30 1.00 0.32 0.48
0.001 30 0.44 0.21 0.29
0.0001 30 0.94 0.35 0.89

The Mask R-CNN architecture demonstrates varied performance across different learning rates in

precision, recall, and F1-score when identifying early blight. With a learning rate of 0.01, the

model shows high precision across epochs but varying recall and F1-score results. At 10

epochs, precision is 0.92, recall is 0.55, and F1-score is 0.69, indicating that the model

correctly identifies many true positives but has moderate recall, resulting in an intermediate

F1-score. Increasing to 20 epochs, the precision rises slightly to 0.93, with a notable

improvement in recall to 0.70, yielding a higher F1-score of 0.80. However, at 30 epochs, the

precision reaches a perfect 1.00, but recall drops sharply to 0.32, lowering the F1-score to

0.48. This suggests overfitting, where the model confidently predicts fewer instances, thereby

lowering recall.

With a learning rate of 0.001, the model’s performance is generally less consistent and

slightly lower across metrics. At 10 epochs, precision, recall, and F1-score are lower at 0.39,

0.35, and 0.37, respectively, suggesting limited learning. After 20 epochs, precision improves

to 0.90, but recall is still relatively low at 0.50, producing an F1-score of 0.60. By 30 epochs,

precision decreases to 0.44, and recall falls to 0.21, resulting in an F1-score of 0.29, indicating

that the model struggles to balance between true positive predictions and overall sensitivity.

With a smaller learning rate of 0.0001, the model initially exhibits moderate performance but

faces difficulties in retaining consistency as training continues. At 10 epochs, precision is

0.69, recall is 0.45, and the F1-score is 0.57, indicating a relatively balanced but moderate

result. At 20 epochs, performance drops, with precision falling to 0.50 and recall plummeting

to 0.05, resulting in a very low F1-score of 0.09, which suggests ineffective learning. By 30

epochs, the model shows high precision at 0.94, but recall remains low at 0.35, with a reported F1-score of 0.89, which may indicate improved prediction confidence rather than a balanced set of correct classifications.

Table 12: Training results of Mask R-CNN architecture based on precision, recall, and
F1-Score of Verticillium wilt.

Learning Rate Epoch Precision Recall F1-Score
0.01 10 0.83 1.00 0.91
0.001 10 0.80 0.27 0.40
0.0001 10 0.84 0.37 0.53
0.01 20 0.94 1.00 0.97
0.001 20 0.37 1.00 0.54
0.0001 20 0.81 0.40 0.43
0.01 30 0.88 1.00 0.94
0.001 30 0.38 0.30 0.45
0.0001 30 0.95 0.90 0.93

The Mask R-CNN architecture was successfully configured, and the values that give the best result for training the Verticillium wilt class are 20 epochs and a 0.01 learning rate. With a learning rate of 0.01, the model shows strong performance across all epoch levels. At 10 epochs, it achieves a precision of 0.83 and a recall of 1.00, resulting in an F1-score of 0.91. This indicates that the model detects all Verticillium wilt instances, with some minor inaccuracies. When training reaches 20 epochs, precision improves to 0.94 while maintaining a perfect recall, leading to an F1-score of 0.97, which demonstrates a near-ideal balance between precision and sensitivity. At 30 epochs, precision slightly decreases to 0.88, but recall remains at 1.00, resulting in an F1-score of 0.94. This indicates that while increasing the epochs can enhance accuracy, the improvement starts to plateau, suggesting that further training may not yield significant gains.

With a learning rate of 0.001, the model has difficulty balancing precision and recall. At 10 epochs, the model has a precision of 0.80 but a recall of only 0.27, yielding an F1-score of 0.40, which suggests that while it avoids false positives, it misses most Verticillium wilt cases. As training progresses to 20 epochs, recall improves to 1.00, but at the cost of precision, which drops to 0.37, resulting in an F1-score of 0.54. This implies the model becomes overly sensitive and misclassifies many cases as Verticillium wilt. At 30 epochs, precision and recall become more balanced at 0.38 and 0.30, respectively, with an F1-score of 0.45. Overall, at this learning rate, the model struggles to achieve strong performance in detecting Verticillium wilt, even with extended training.

At a learning rate of 0.0001, the model requires more epochs to reach higher accuracy. At 10 epochs, it has a precision of 0.84 and a recall of 0.37, resulting in an F1-score of 0.53, which shows moderate precision but low sensitivity in detecting Verticillium wilt cases. Increasing to 20 epochs, precision slightly decreases to 0.81, with a recall of 0.40 and an F1-score of 0.43, indicating minimal improvement. However, at 30 epochs, the model achieves a significant boost in both precision and recall, reaching 0.95 and 0.90, respectively, and yielding an F1-score of 0.93. This suggests that with enough training, a low learning rate can eventually yield balanced and high performance, though it requires longer training time compared to higher learning rates.

CHAPTER V

CONCLUSION AND RECOMMENDATIONS


Conclusion

The researchers created their own dataset for classifying eggplant leaves as healthy or as having one of the diseases cercospora leaf spot, early blight, or verticillium wilt. This involved collecting, resizing, and annotating a diverse set of images, splitting the images into training and validation sets, training a deep learning model, evaluating its performance across hyperparameter settings, and iterating on the process to achieve better results.

The Mask R-CNN model demonstrated varying performance across the different leaf disease classes. For Early blight, the highest F1-score achieved was 0.89, while for Verticillium wilt it reached 0.97. However, the model's performance was sensitive to the learning rate and number of epochs, with suboptimal results observed for certain hyperparameter combinations. Additionally, a trade-off between precision and recall was evident in some cases. Overall, this implies that the Mask R-CNN model can correctly classify the healthy leaves and the leaf diseases indicated in the study.

The results highlight the effective performance of Mask R-CNN in classifying eggplant leaves as healthy or diseased. For Cercospora leaf spot, Verticillium wilt, and Early blight, the model achieved F1-scores ranging from 0.37 to 0.94, with precision and recall sensitive to hyperparameters such as learning rate and epochs. While the model showed promising results, further optimization is needed to improve performance for Early blight and Cercospora leaf spot. To further improve the model's overall performance and generalization, additional hyperparameter tuning, data augmentation, and addressing class imbalance are recommended.

Recommendations

Although this research has shown the potential of Mask R-CNN for detecting eggplant

leaf diseases, there are still opportunities for future studies to improve the model's

abilities and real-world uses.

First, it is essential to expand the dataset to encompass a broader spectrum of eggplant leaf diseases, enhancing the model's ability to generalize and its accuracy. By including pictures of uncommon or new diseases, the model can learn to identify a wider range of risks to eggplant farms.

Additionally, creating a mobile application that is easy for users to navigate could

greatly transform disease detection within agricultural environments. This type of app has

the potential to use smartphone capabilities to snap pictures of damaged leaves, enabling

farmers to get instant diagnosis and treatment advice on their devices. This would enable

farmers to make well-informed choices and act promptly to reduce losses.

Moreover, delving into additional cutting-edge deep learning approaches like YOLOv8

or EfficientDet may enhance the model's accuracy, speed, and efficiency. These cutting-

edge algorithms are recognized for their capability to identify objects instantly, which

makes them perfect for mobile apps. Furthermore, implementing hardware optimization

methods, such as making use of specialized hardware like TPUs or GPUs, could greatly

speed up the inference process, leading to quicker and more effective disease detection.

In summary, this study provides a strong basis for future progress in identifying diseases in eggplant leaves. By acknowledging the constraints of the present study and exploring the suggested research paths, stronger and more feasible solutions can be created to assist farmers in safeguarding their crops and guaranteeing food security.

References

Ansari, A. M., Hasan, W., & Prakash, M. (Eds.). (2021). Solanum Melongena: Production, Cultivation and Nutrition. Nova Science Publishers.

Applied With Different Levels of Chicken Dung. EnvironmentAsia, 13 (Special issue) (2020), 81-86.

Aumentado, H. D., & Balendres, M. A. (2022). Characterization of Corynespora


cassiicola causing leaf spot and fruit rot in eggplant (Solanum melongena L.).
Archives of Phytopathology and Plant Protection, 55(11), 1304-1316.

Aumentado, H. D., & Balendres, M. A. (2023). Diaporthe melongenae sp. nov, a new
fungal causing leaf blight in eggplant. Journal of Phytopathology. Advance
online publication. https://doi.org/10.1111/jph.13246

Baquiran, D. D. M., Calica, S. F. C., Constante, K. N. G., Decena, P. V. S., Gabriel, R.


M. T., Gumpad, E. J. D., ... & Tumanguil, M. L. J. Antifungal Property of
Himbabao (Broussonetia luzonica) Leaves Extract Against Phomopsis vexans
Blight on Eggplants (Solanum melongena).

Bhattacharya, S., Banerjee, A., Ray, S., Mandal, S., & Chakraborty, D. (2022,
September). An Advanced Approach to Detect Plant Diseases by the Use of
CNN Based Image Processing. In International Conference on Innovations in
Computer Science and Engineering (pp. 467-478). Singapore: Springer Nature
Singapore.

Chakravarty, A., Jain, A., & Saxena, A. K. (2022, December). Disease Detection of
Plants using Deep Learning Approach—A Review. In 2022 11th International
Conference on System Modeling & Advancement in Research Trends (SMART)
(pp. 1285-1292). IEEE.

Cheng, L., Li, J., Duan, P., & Wang, M. (2021). A small attentional YOLO model for
landslide detection from satellite remote sensing images. Landslides, 18(8),
2751-2765.

Chowdhury, M. E., Rahman, T., Khandakar, A., Ayari, M. A., Khan, A. U., Khan, M.
S., ... & Ali, S. H. M. (2021). Automatic and reliable leaf disease detection using
deep learning techniques. AgriEngineering, 3(2), 294-312.

Dimkić, I., Janakiev, T., Petrović, M., Degrassi, G., & Fira, D. (2022). Plant-
associated Bacillus and Pseudomonas antimicrobial activities in plant disease
suppression via biological control mechanisms-A review. Physiological and
Molecular Plant Pathology, 117, 101754.

Jayanthi, M. G., et al. (2022). Eggplant leaf disease detection and segmentation using adaptively regularized multi-Kernel-Based Fuzzy C-Means and Optimal PNN classifier. Indian Journal of Computer Science and Engineering (IJCSE), 13(5), Sep-Oct.

Haque, M. R., & Sohel, F. (2022). Deep network with score level fusion and
inference-based transfer learning to recognize leaf blight and fruit rot diseases of
eggplant. Agriculture, 12(8), 1160.

Jindo, K., Evenhuis, A., Kempenaar, C., Sudré, C. P., Zhan, X., Teklu, M.
G., & Kessel, G. (2021, February 27). Review: Holistic pest management
against early blight disease towards sustainable agriculture. Pest Management
Science (Print). https://doi.org/10.1002/ps.6320.

Jin X., Che J., Chen Y., 2021. Weed identification using deep learning and image
processing in vegetable plantation. IEEE Access Volume 9, Issue 1,pp 10940–
10950.https://doi.org/10.1109/ACCESS.2021.305029.

Kaniyassery, A., Goyal, A., Thorat, S. A., Rao, M. R., Chandrashekar, H. K., Murali,
T. S., & Muthusamy, A. Association of Meteorological Variables with Leaf Spot
and Fruit Rot Disease Incidence in Eggplant and Ai-Based Disease
Classification. Available at SSRN 4555881.

Zheng, K., Fang, C., Yuan, S., Feng, C., & Li, G. (2022). Application of Mask R-CNN Model in Identification of Eggplant Flowering Period. Journal of Computer Engineering & Applications, 58(18).

Kaniyassery, A., Thorat, S. A., Kiran, K. R., Murali, T. S., & Muthusamy, A. (2023).
Fungal diseases of eggplant (Solanum melongena L.) and components of the
disease triangle: a review. Journal of Crop Improvement, 37(4), 543-594.

Khan, A. I., & Al-Habsi, S. (2020, January 1). Machine Learning in Computer Vision.
Procedia Computer Science. https://doi.org/10.1016/j.procs.2020.03.355

Kashyap, G. S., Kamani, P., Kanojia, M., Wazir, S., Malik, K., Sehgal, V. K., &
Dhakar, R. (2024). Revolutionizing Agriculture: A Comprehensive Review of
Artificial Intelligence Techniques in Farming.

Li, H., Shi, H., Du, A., Mao, Y., Fan, K., Wang, Y., ... & Ding, Z. (2022). Symptom
recognition of disease and insect damage based on Mask R-CNN, wavelet
transform, and F-RNet. Frontiers in Plant Science, 13, 922797.

Lippi, M., Bonucci, N., Carpio, R. F., Contarini, M., Speranza, S., & Gasparri, A.
(2021, June). A yolo-based pest detection system for precision agriculture. In
2021 29th Mediterranean Conference on Control and Automation (MED) (pp.
342-347). IEEE.

Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., ... & Yang, J. (2020). Generalized
focal loss: Learning qualified and distributed bounding boxes for dense object
detection. Advances in Neural Information Processing Systems, 33, 21002-
21012.

Maggay, J. G. (2020). Mobile-Based Eggplant Diseases Recognition System using


Image Processing Techniques. International Journal of Advanced Trends in
Computer Science and Engineering, 9(1.1), 182-190. ISSN 2278-3091. Volume
9, No.1.1, 2020.

Ma, W., Wang, X., Qi, L., & Zhang, D. (2019). Identification of Eggplant Young
Seedlings Infected by Root Knot Nematodes Using Near Infrared Spectroscopy.
In Computer and Computing Technologies in Agriculture X: 10th IFIP WG 5.14
International Conference, CCTA 2016, Dongying, China, October 19–21, 2016,
Proceedings 10 (pp. 93-100). Springer International Publishing.

Matsuzaka, Y., & Yashiro, R. (2023). AI-based computer vision techniques and expert
systems. AI, 4(1), 289-302.

Melicano III, M. (2024, March 7). Eggplant Leaf Diseases. Personal interview.

Nasution, S. W., & Kartika, K. (2022). Eggplant Disease Detection Using Yolo
Algorithm Telegram Notified. International Journal of Engineering, Science
and Information Technology, 2(4), 127-132.

Ngugi, L. C., Abelwahab, M., & Abo-Zahhad, M. (2021). Recent advances in image
processing techniques for automated leaf pest and disease recognition–A review.
Information processing in agriculture, 8(1), 27-51.

Oxford University Press. (2021). Utilize. In Lexico.com. Retrieved from


https://www.lexico.com/definition/utilize

PDMG-V3-39/PG047: 2021 Florida Plant Disease Management Guide: Eggplant.


(n.d.). Ask IFAS - Powered by EDIS. https://edis.ifas.ufl.edu/publication/PG047

Shafique, S., Sahar, S., & Akhtar, N. (2019). First report of


Cladosporium cladosporioides instigating leaf spot of Solanum melongena from
Pakistan. Pakistan Journal of Botany, 51(2), 755-759. DOI: 10.30848/PJB2019-
2(43)

Smith, M. (2019). Mastering APIs: Design and Build High-Quality Web APIs.
O'Reilly Media.
Terven, J., Córdova-Esparza, D. M., & Romero-González, J. A. (2023). A
comprehensive review of yolo architectures in computer vision: From yolov1 to
yolov8 and yolo-nas. Machine Learning and Knowledge Extraction, 5(4), 1680-
1716.

Viswanath, K. K., Varakumar, P., Pamuru, R. R., Basha, S. J., Mehta, S., & Rao, A. D.
(2020). Plant lipoxygenases and their role in plant physiology. Journal of Plant
Biology, 63, 83-95.

Wang, Y., & Zheng, J. (2018). Real-time face detection based on YOLO. In Proceedings of the 2018 1st IEEE International Conference on Knowledge Innovation and Invention (ICKII), Jeju, Republic of Korea, 23–27 July 2018 (pp. 221–224).

Wu, D.; Lv, S.; Jiang, M.; Song, H. Using channel pruning-based YOLO v4 deep
learning algorithm for the real-time and accurate detection of apple flowers in
natural environments. Comput. Electron. Agric. 2020, 178, 105742.

Wu, D., Feng, L., Zhang, C., & He, Y. (2008). Early detection of Botrytis cinerea on
eggplant leaves based on visible and near-infrared spectroscopy. Transactions of
the ASABE, 51(3), 1133-1139.

Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020, April). Distance-IoU
loss: Faster and better learning for bounding box regression. In Proceedings of
the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 12993-
13000).

Machefer, M., Lemarchand, F., Bonnefond, V., Hitchins, A., & Sidiropoulos, P.
(2020). Mask R-CNN refitting strategy for plant counting and sizing in UAV
imagery. Remote Sensing, 12(18), 3015.

Rehman, Z. U., Khan, M. A., Ahmed, F., Damaševičius, R., Naqvi, S. R., Nisar, W., &
Javed, K. (2021). Recognizing apple leaf diseases using a novel parallel real‐
time processing framework based on MASK RCNN and transfer learning: An
application for smart agriculture. IET Image Processing, 15(10), 2157-2168.

APPENDICES

Appendix A.

Eggplant Leaf Resizing

Appendix B.

Eggplant Leaf Dataset

Healthy

Cercospora Leaf Spot

Early Blight

Verticillium wilt

Appendix C.

Validation result of Mask R-CNN at 0.01, 0.001, and 0.0001


Learning Rate in 10, 20 and 30 Epochs

Validation result of Mask R-CNN at 0.01 Learning Rate and 10 Epochs

Validation result of Mask R-CNN at 0.001 Learning Rate and 10 Epochs

Validation result of Mask R-CNN at 0.0001 Learning Rate and 10 Epochs

Validation result of Mask R-CNN at 0.01 Learning Rate and 20 Epochs

Validation result of Mask R-CNN at 0.001 Learning Rate and 20 Epochs

Validation result of Mask R-CNN at 0.0001 Learning Rate and 20 Epochs

Validation result of Mask R-CNN at 0.01 Learning Rate and 30 Epochs

Validation result of Mask R-CNN at Learning Rate 0.001 and 30 Epochs

Validation result of Mask R-CNN at 0.0001 Learning Rate and 30 Epochs

Appendix D. Source Code

Testing by Uploading Image

import os
import cv2
import numpy as np
import tkinter as tk
from tkinter import filedialog
import matplotlib.pyplot as plt

# Import ONNX dependencies
import onnx  # Import the onnx module
import onnxruntime as ort  # Import the ONNX Runtime

import torch
from torchvision.ops import nms
from torchvision.transforms import v2 as T
from torchvision.utils import draw_bounding_boxes, draw_segmentation_masks

onnx_file_path = "./checkpoint.onnx"
# Load the model and create an InferenceSession
session = ort.InferenceSession(onnx_file_path, providers=['CPUExecutionProvider'])

label_dict = ["background", "Healthy", "Cercospora_leafspot", "Verticillium_wilt",
              "Early_blight"]


def get_transform(train):
    transforms = []
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    transforms.append(T.ToDtype(torch.float, scale=True))
    transforms.append(T.ToPureTensor())
    return T.Compose(transforms)


def open_img():
    x = openfilename()

    image_input = cv2.imread(x)
    image_input = cv2.resize(image_input, (512, 512))

    eval_transform = get_transform(train=False)

    # Normalize to [0, 1] and add a batch dimension (NCHW layout)
    input_tensor_np = np.array(image_input, dtype=np.float32).transpose((2, 0, 1))[None] / 255

    # Run inference
    model_output = session.run(None, {"input": input_tensor_np})

    image = torch.Tensor(image_input.transpose(2, 0, 1))

    # Keep only the highest-scoring detection
    scores = torch.Tensor(model_output[2])
    idx = torch.argmax(scores)

    print(model_output[2])
    print(model_output[1])
    pred_boxes = torch.Tensor(np.array([model_output[0][idx]]))
    labels = torch.Tensor([model_output[1][idx]]).to(torch.uint8)
    scores = torch.Tensor([model_output[2][idx]])
    masks = torch.Tensor(model_output[3][idx])

    pred_labels = ["{0} {1:.3f}".format(label_dict[labels[0]], scores[0])]

    # Rescale the image to 0-255 for drawing
    image = (255.0 * (image - image.min()) / (image.max() - image.min())).to(torch.uint8)
    image = image[:3, ...]

    output_image = draw_bounding_boxes(image, pred_boxes, pred_labels, colors="red")

    # Binarize the predicted mask at a 0.7 confidence threshold
    masks = (masks > 0.7).squeeze(1)
    output_image = draw_segmentation_masks(output_image, masks, alpha=0.5, colors="blue")

    plt.figure(figsize=(8, 8))
    plt.imshow(output_image.permute(1, 2, 0))
    plt.show()


def openfilename():
    file = filedialog.askopenfilename(title='Open')
    if file:
        filepath = os.path.abspath(file)
        return str(filepath)


if __name__ == "__main__":
    root = tk.Tk()
    root.title("Image Loader")
    root.geometry('400x200')
    root.resizable(width=True, height=True)
    btn = tk.Button(root, text='Open Image', command=open_img)
    btn.grid(row=9, columnspan=9)
    root.mainloop()

Appendix E. Documentary of Eggplant leaf

Appendix F. Curriculum Vitae

CABRILLOS, HERMIE ROSE G.


Trinidad, San Remigio, Antique
09626744331
[email protected]
PERSONAL DATA
AGE: 23
SEX: Female
BIRTHDAY: August 20, 2001
CIVIL STATUS: Single
NATIONALITY: Filipino
RELIGION: IFI

EDUCATION

Tertiary University of Antique


Mayor Santiago A. Lotilla St., District 1, Sibalom, Antique
2020-Present
Secondary St. Vincent’s High School of San Remegio Incorporated
Poblacion, San Remegio, Antique
2013-2020

Elementary Trinidad Elementary School


Trinidad, San Remigio, Antique
2008-2013

REFERENCES

John Jowil D. Orquia


Faculty
University of Antique
Sibalom, Antique

CABRILLOS, SANNY M.
Igcabito-on, Miag-ai, Iloilo
09657191713
[email protected]
PERSONAL DATA
AGE: 24
SEX: Male
BIRTHDAY: January 01, 2000
CIVIL STATUS: Single
NATIONALITY: Filipino
RELIGION: Catholic

EDUCATION

Tertiary University of Antique


Mayor Santiago A. Lotilla St., District 1, Sibalom, Antique
2020-Present
Secondary Bacolod, National, High School
Bacolod, Miag-ao, Iloilo
2013-2020

Elementary Dalije Elementary School


Dalije, Miag-ao, Iloilo
2007-2013

REFERENCES

John Jowil D. Orquia


Faculty
University of Antique
Sibalom, Antique

FORTIN, JENNY ROSE S.
Trinidad, San Remigio, Antique
09692443028
[email protected]

PERSONAL DATA
AGE: 22
SEX: Female
BIRTHDAY: February 02, 2002
CIVIL STATUS: Single
NATIONALITY: Filipino
RELIGION: Catholic

EDUCATION

Tertiary University of Antique


Mayor Santiago A. Lotilla St., District 1, Sibalom, Antique
2020-Present
Secondary St. Vincent’s High School of San Remegio Incorporated
Poblacion, San Remegio, Antique
2013-2020

Elementary Trinidad Elementary School


Trinidad, San Remigio, Antique
2008-2013

REFERENCES

John Jowil D. Orquia


Faculty
University of Antique
Sibalom, Antique

