International Journal of Innovative Research in Computer Science and Technology (IJIRCST)

ISSN (Online): 2347-5552, Volume-12, Issue-5 September 2024


https://doi.org/10.55524/ijircst.2024.12.5.15
Article ID IRP-1565, Pages 110-116
www.ijircst.org

Classification of Healthy and Diseased Broccoli Leaves Using a Custom Deep Learning CNN Model

Saikat Banerjee¹, Soumitra Das², and Abhoy Chand Mondal³
¹ State Aided College Teacher, Department of Computer Applications, Vivekananda Mahavidyalaya, Haripal, Hooghly, West Bengal, India
² Research Scholar, Department of Computer Science, The University of Burdwan, Golapbag, West Bengal, India
³ Professor, Department of Computer Science, The University of Burdwan, Golapbag, West Bengal, India
Correspondence should be addressed to Saikat Banerjee;
Received 5 September 2024; Revised 19 September 2024; Accepted 30 September 2024
Copyright © 2024 Saikat Banerjee et al. This is an open-access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT- Agriculture is essential for sustaining the global population and is a crucial element in economic returns and food supply. However, plant leaf diseases are a major threat to agriculture and the economy since they reduce yield and increase the cost of production. Because broccoli, a profitable crop, is in high demand in the market, it offers considerable business opportunities for farmers. Nevertheless, like many other food crops, it is vulnerable to diseases that may affect its production and quality. Preventing losses from these diseases requires early detection of the disease affecting the leaves, which advances in technology, especially deep learning, can support. In this research, a new, specifically designed CNN model for differentiating healthy and diseased broccoli leaves is proposed. Data were collected directly from the field using mobile cameras, and the images were sorted into healthy and unhealthy classes. A new CNN model with an architecture specific to this dataset was designed and trained using Keras. The results show that the model predicts the health status of a leaf accurately. The use of deep learning for disease diagnosis in crops enables farmers to make timely interventions, thus protecting their crops and their economic and nutritional value. This research highlights the possibilities of applying advanced technology to the practice of agriculture.

KEYWORDS- Broccoli, Deep Learning, Convolutional Neural Networks, Leaf Disease Classification, TensorFlow, Keras.

I. INTRODUCTION

The contribution of agriculture goes beyond the provision of food; it is crucial in supporting countries' economies and feeding the world. To meet the needs of the world's growing population, it is necessary to produce quality crops. However, plant diseases are a real problem for agricultural production, and diseases affecting leaves are especially harmful since they result in lower yields, reduced quality and production efficiency, and substantial economic repercussions. Early identification and control of these diseases is therefore critical in maintaining agricultural productivity and farmers' business profitability. Broccoli is among the most grown and marketed crops today due to the rapidly increasing demand for nutrients in the global market. Farming broccoli has both nutritional and business value to farmers since broccoli is nutritious and has a ready market.

Nevertheless, like many other crops, broccoli is constantly threatened by diseases, particularly those of the leaves. Hemilea and Uettwilian mirids are diseases that, if not diagnosed early, can cut yields dramatically. Detecting diseases mainly involves inspection, which is usually tiresome and ambiguous. Farmers therefore need automated means to help them notice diseases early on. Recently, deep learning, especially Convolutional Neural Networks (CNNs), has shown great potential in image classification tasks. CNNs are able to localize minuscule features specific to images, which makes it feasible to detect diseases in plant leaves. In this study, we develop a new convolutional neural network model trained specifically to classify broccoli leaves as healthy or unhealthy using images captured with mobile cameras in the field. Our deep learning approach can offer a swift and precise solution to disease detection, enabling farmers to act swiftly and protect their yields with respect to both quality and quantity. The remainder of this paper discusses data collection and preparation, the proposed CNN model, and the outcome of applying the model for classification. The demonstrated efficacy of deep learning in detecting diseases early in broccoli farming is one way to boost farming productivity and income.

II. LITERATURE REVIEW

For disease detection in leaves, the review in [1] compares ML and DL techniques, covering earlier approaches such as SVM and Decision Tree along with more recent methods such as CNN and RNN. It identifies recurring issues, namely data quality and the demand for computational power, and discusses the future use of various data types and of more than one model. Based on enhanced images of plant leaves, another study employs ML methods such as

Innovative Research Publication 110


International Journal of Innovative Research In Engineering and Management (IJIREM)

SVM, k-NN, and Decision Tree to diagnose leaf diseases [2]. SVM achieved 97% accuracy. The issues noted include data quality, and the next steps will focus on developing hybrid models and field testing the models. In [3], SVM and feature extraction are used to improve the identification of diseases on leaves, but the work faces challenges with data quality and online processing. The future work planned in that framework is to use DL models and more extensive data sets. In [4], Yao et al. discussed the performance of traditional ML techniques and CNN approaches, stressing the importance of obtaining robust datasets. Constraints include image quality, size, and complexity of the model. The study in [5] employs relatively straightforward image analysis procedures for identifying various kinds of diseased leaves and obtains good results with SVM and k-means clustering methods. Limitations on datasets are discussed explicitly. According to [6], a highly accurate convolutional neural network-based model for the detection of diseases in leaves needs large, diverse inputs; further studies may look into transfer learning or other types of data. In [7], SVM, Random Forest, and Decision Trees are employed to classify the diseases affecting the leaves. Future work concerns DL, feature extraction techniques, and data set enlargement. The author of [8] discusses differences between ML and DL when applied to plant disease diagnosis, identifying issues such as overfitting, data size, and computational requirements, and notes that future research should enhance scalability and real-world application.

III. COMMON LEAF DISEASES OF BROCCOLI

Alternaria leaf spot: Spots are small and dark in color. They grow, then turn into tiny circles with a diameter of 1 mm.
White rust: White, shiny, raised, blister-like spots or pustules, often found on the underside of the leaves, the stem, and the flower.
Black rot: First appears as chlorotic or yellow (angular) spots, usually around the edges of the leaves. The yellow area extends to the veins and midrib and forms a 'V'-shaped chlorotic spot, which blackens at a later stage.
Downy mildew: Small purplish-brown spots on the surface of the leaves. On the upper side of the leaves there are angular, small, pale yellow spots, and on the underside there is a down-like growth. The spots join together, and the leaves roll up and drop off prematurely.

IV. RESEARCH METHODOLOGY

Broccoli leaf classification involves data collection, data cleansing, feature extraction, model development, model training, evaluation of the CNN model, and classification with this CNN model.

A. Data Collection

To construct a specialized Convolutional Neural Network (CNN) model that accurately detects diseases in broccoli leaves, many images of broccoli leaves were obtained manually from agricultural fields. It was important to ensure that the dataset was diverse and of high quality, which is critical for training a robust model that generalizes well beyond the training data. Altogether, ten thousand images were gathered. Data collection was conducted in two phases to capture seasonal variations. The first phase was conducted in December, during which we acquired 5000 images. Figure 1 depicts the broccoli field. The second phase was carried out in February, when we collected the remaining 5000 images. This temporal division allows us to look at the differences in environmental conditions and their influence on the health of the broccoli leaves.

Figure 1: Broccoli Field

In December 2023, we collected images from fields at the geographical coordinates 23.9802629 N, 86.9551737 E. Most of the leaves collected in this period were healthy and showed the typical form of healthy leaves. Figure 2 illustrates the gathering of data in the field. In February, on the other hand, we gathered pictures from fields at latitude 23.83276 and longitude 86.90298. This phase mainly provided unhealthy leaves since environmental conditions at this stage are suitable for disease formation.

Figure 2: On field data collection

In addition, this methodical approach to selection was useful for obtaining a balanced collection of data covering various stages of disease and healthy conditions, which is necessary for building an efficient CNN for disease detection.


B. Data Preprocessing

Image preprocessing was subdivided into two stages: an initial assessment of the collected data and preparation of the material for training the CNN model. The initial task was to categorize the images into two primary folders: healthy and unhealthy. This manual separation was very labor-intensive since each image had to be inspected and sorted according to its identified category. Figure 3 shows the method for splitting and preprocessing the train and test data.

Figure 3: Train Test Data split and preprocessing methodology
[Flowchart boxes: "We collect field data manually and have gathered 10,000 images." / "80% of the data is used to train the model, and 20% is used for validation." / "4,516 images have been labeled into 'Healthy' and 'Unhealthy' categories." / "All images have been resized to 255 x 255 pixels."]

Overall, we could tag 5516 images with the labels healthy and unhealthy. To further structure the dataset for the training process, we created two main folders: train and test. The train folder was filled with 4516 images, and the test folder with 1000 images. Each folder was further divided into two subfolders: healthy and unhealthy. The train folder contains 2016 healthy images and 2500 unhealthy images, and the test folder contains 500 healthy images and 500 unhealthy images. This was done to stay close to an 80/20 split of the data: about 82% of the labeled images are used to train the model and about 18% are used to test the trained model.

Given that the images were captured using mobile cameras, their sizes varied significantly. To ensure uniformity and facilitate effective model training, we resized all the images to a standard dimension of 255x255 pixels. This resizing was accomplished using the Python Pillow library. This step was vital for maintaining consistency in the input data, thereby enhancing the performance and accuracy of the CNN model in detecting broccoli leaf diseases.

Figure 4: Sample Leaves and Folder Structure

C. CNN Model Creation and Evaluation

This study used TensorFlow and Keras to develop the CNN model for identifying broccoli leaf diseases. A pictorial representation of the CNN model's workflow, broken down into individual steps, is displayed in Figure 5. The process started with loading the images in the 'train' and 'test' directories using the `image_dataset_from_directory` utility from Keras, which inferred the labels from the directory structure. To make the best use of the data, it was split into training and validation sets with a batch size of 32 and an image size of 255*255 pixels. Sample leaves and the folder structure are shown in Figure 4.
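As a quick check of the folder counts reported in the Data Preprocessing subsection, the short sketch below recomputes the totals and the share of each split. The dictionary layout and the helper name `split_summary` are illustrative assumptions, not the authors' code; only the four image counts come from the paper.

```python
# Image counts per folder, as reported in the Data Preprocessing subsection.
counts = {
    "train": {"healthy": 2016, "unhealthy": 2500},
    "test": {"healthy": 500, "unhealthy": 500},
}


def split_summary(counts):
    """Return the total image count, per-split totals, and per-split shares."""
    split_totals = {split: sum(classes.values()) for split, classes in counts.items()}
    total = sum(split_totals.values())
    shares = {split: n / total for split, n in split_totals.items()}
    return total, split_totals, shares


total, split_totals, shares = split_summary(counts)
print(total, split_totals)
for split, share in shares.items():
    print(f"{split}: {share:.1%}")
```

Under these counts the train folder holds roughly 82% of the labeled images and the test folder roughly 18%, which is close to, but not exactly, an 80/20 split.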
After that, we normalized the pixel values by dividing every pixel by 255 to bring the values into the range 0 to 1. A dedicated function was written to convert the pixel values to floating point for this normalization. Normalizing the images made the input data uniform for the model.
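The normalization step can be illustrated with a minimal pure-Python sketch. The function name `normalize_pixels` and the tiny sample image are hypothetical; the paper only states that pixel values were converted to float and divided by 255. In a TensorFlow pipeline the same scaling would typically be applied to whole batches rather than nested lists.

```python
def normalize_pixels(image):
    """Scale 8-bit pixel values (0-255) to floats in the range [0.0, 1.0]."""
    return [[[channel / 255.0 for channel in pixel] for pixel in row] for row in image]


# Hypothetical 1x2 RGB image, used only for illustration.
tiny_image = [[(0, 128, 255), (255, 64, 32)]]
normalized = normalize_pixels(tiny_image)
print(normalized)
```

Every output value is a float between 0 and 1, so all images reach the network on the same numeric scale regardless of their original brightness.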
The CNN was developed using the Sequential API from Keras. The architecture begins with a Conv2D layer with 32 filters, a kernel size of 3*3, and ReLU activation. It is followed by a MaxPooling2D layer with a pool size of 2. This pattern is repeated with 64 and then 128 filters in further Conv2D layers, each followed by a MaxPooling2D layer. These convolutional and pooling layers perform feature extraction from the images. After the convolutional layers, a Flatten layer converts the feature maps into 1D feature vectors. It is followed by two Dense layers with 128 and 64 units, both with ReLU activation. The last layer is a Dense layer with one neuron and sigmoid activation, which classifies a given leaf as either healthy or unhealthy. The model was trained using the Adam optimizer and the binary cross-entropy loss function, with accuracy as the metric. Training was carried out for ten epochs, and the model's accuracy was evaluated on the validation dataset. Training accuracy increased consistently from 61.43% in the first epoch to 100% in the last epoch. The plots of train accuracy and train loss, train accuracy and validation accuracy, train loss and validation loss, and validation accuracy and validation loss are shown in Figures 6, 7, 8, and 9, respectively. A similar trend was observed for validation accuracy, which rose rapidly from 72.80% to 100% by the tenth epoch. The final model can accurately distinguish healthy from unhealthy broccoli leaves, with an accuracy of 100% on both the training and validation sets.
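To make the layer stack described above more concrete, the following sketch traces the feature-map size and the number of trainable parameters through the network. It is a worked calculation, not the authors' code, and it rests on assumptions the paper does not state: RGB input (3 channels), stride-1 convolutions without padding, and non-overlapping 2*2 pooling, which are the Keras defaults. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def trace_model(input_size=255, channels=3, filters=(32, 64, 128), dense_units=(128, 64, 1)):
    """Return (final spatial size, flattened feature count, trainable parameters)."""
    size, depth, params = input_size, channels, 0
    for f in filters:
        params += (3 * 3 * depth + 1) * f  # 3x3 kernel weights plus one bias per filter
        size = pool_out(conv_out(size))    # Conv2D followed by MaxPooling2D
        depth = f
    features = size * size * depth         # output length of the Flatten layer
    fan_in = features
    for units in dense_units:
        params += (fan_in + 1) * units     # dense weights plus one bias per unit
        fan_in = units
    return size, features, params


final_size, flat_features, total_params = trace_model()
print(final_size, flat_features, total_params)
```

Under these assumptions the three convolution and pooling stages reduce a 255*255 image to 30*30*128 feature maps, and most of the parameters sit in the first Dense layer that follows the Flatten layer.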

Figure 5: Pictorial Representation of CNN model


Figure 6: Train Accuracy and Train Loss

Figure 7: Train Accuracy and Validation Accuracy

Figure 8: Train Loss and Validation Loss

Figure 9: Validation Accuracy and Validation Loss

V. VISUALIZATION AND MODEL ACTIVATION

Heat maps of the convolutional layers were used to analyze the trained CNN model in more detail and to determine how it recognizes healthy and unhealthy broccoli leaves. This visualization helps show which areas of an image the model focuses on when making its predictions. To do so, we created an activation model using TensorFlow and Keras that, given the same input as the original model, outputs the activations of the first convolutional layer. We applied this activation model by passing it a sample image from the test dataset to obtain the activations. In the resulting feature maps, the observed blobs indicate the features the model has learned to identify in the first convolutional layer.

These feature maps were plotted in a figure composed of several sub-figures, each representing one filter of the convolutional layer. Figure 10 displays a heatmap of an unhealthy leaf using a bone colormap, and Figure 11 displays a heatmap of a healthy leaf using a grayscale colormap. The 'gray' and 'bone' colormaps were chosen for improved contrast and clarity. This process gave a clear representation of how the filters of the convolutional layer interpreted the various areas of the input image to arrive at the classification of the leaf pictures.

Figure 10: Heatmap of Unhealthy leaf using bone colormap

Figure 11: Heatmap of Healthy leaf using gray colormap

This visualization method not only helps to increase the model's interpretability but also allows one to spot potential opportunities for improving the architecture or the training process.
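What a single first-layer feature map represents can be sketched without any deep learning library. The example below applies one 3*3 filter followed by ReLU to a tiny grayscale patch. The patch and the hand-written vertical-edge kernel are hypothetical stand-ins: in the trained model the filters are learned, and the activations come from the Keras activation model rather than from this function.

```python
def conv2d_relu(image, kernel):
    """Stride-1 2D cross-correlation without padding, followed by ReLU."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k)
                for dj in range(k)
            )
            row.append(max(0.0, total))  # ReLU keeps only positive responses
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale patch: dark left half, bright right half.
patch = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Hypothetical filter that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

# The same filter with its sign flipped responds to bright-to-dark edges instead.
flipped_kernel = [[-w for w in row] for row in edge_kernel]

edge_map = conv2d_relu(patch, edge_kernel)
flipped_map = conv2d_relu(patch, flipped_kernel)
print(edge_map)
print(flipped_map)
```

The first filter produces strong positive values where the patch goes from dark to bright, while the flipped filter is silenced by ReLU. Bright regions in a heatmap of a feature map can be read the same way: they mark locations where the corresponding filter matched the image.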


VI. RESULT AND DISCUSSION

A. Confusion Matrix

A confusion matrix is a robust measure that can be used to determine the effectiveness of classifiers. It evaluates model performance by stating the number of correctly and wrongly classified samples in each group. It helps to determine not just the errors and the degree of success of the model but also where the discrepancies lie. By generating a confusion matrix for the predictions of the CNN model on the validation set, we were able to assess the model's performance in detecting diseases affecting the broccoli leaf. The matrix compares the actual labels with the predicted labels to show the model's performance on each class. The confusion matrix for the model described above is as follows:

Figure 12: Confusion Matrix

The actual class is represented along the rows of the matrix, and the predicted class along the columns. Diagonal elements show the number of correct classifications, and off-diagonal elements show the number of incorrect classifications.
• True Positives (TP): 486 cases where the model correctly predicted healthy leaves.
• True Negatives (TN): 410 cases where the model gave the right outcome on unhealthy leaves.
• False Positives (FP): No unhealthy leaves were classified as healthy by the model.
• False Negatives (FN): No case was identified where the model classified a healthy leaf as an unhealthy one.

B. Accuracy

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} = \frac{486 + 410}{486 + 0 + 410 + 0} = 1.0$$

C. Performance Metrics

Since accuracy and the other metrics all reached 100%, including recall, the model can differentiate between healthy and unhealthy leaves:
• Precision: Indicates the accuracy of positive predictions.

$$\text{Precision} = \frac{TP}{TP + FP} = \frac{486}{486 + 0} = 1.0$$

• Recall (Sensitivity): Measures the model's ability to correctly identify positive instances.

$$\text{Recall} = \frac{TP}{TP + FN} = \frac{486}{486 + 0} = 1.0$$

• F1 Score: Harmonic mean of precision and recall, providing a balanced metric for evaluation.

$$\text{F1 Score} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = 2 \cdot \frac{1 \cdot 1}{1 + 1} = 1.0$$

The F1 score in the above calculation is 1.00, summarizing precision and recall in a single figure that reflects the class balance. Figure 12 depicts the confusion matrix. In the confusion matrix and derived metrics, we observe that:
• The model performs exceptionally well in identifying healthy and unhealthy leaves, with perfect precision and recall.
• There are no misclassifications, indicating that the model is highly accurate and reliable.
• The overall F1 score of 1.00 suggests a balanced and perfect performance across the classes.

VII. CONCLUSION

In this study, we created a feature extraction model based on the Convolutional Neural Network (CNN) for disease diagnosis on broccoli leaves. Our strategy entailed the collection of a vast dataset: we collected 10000 images from different regions and seasons to make the selection broad-based and inclusive. The raw data was thoroughly preprocessed and sorted into healthy and unhealthy leaf datasets, which were also separated into training and testing data. The CNN model constructed with TensorFlow and Keras achieved a predictive accuracy of 100% on the training and validation data after ten epochs. The presented architecture, with several convolutional, pooling, and dense layers, helped the model capture the features required for classification. In addition, converting images into heatmaps for visualization helped to understand which features were learned by the model and to interpret the result more efficiently. Broccoli farming is reported to be economically profitable for farmers and to yield more profit than other crops, such as cabbage and cauliflower. Using image processing methods to detect diseases early can minimize damage to crops and financial loss. In this respect, our model looks highly promising and may be used as an efficient means of early disease detection.

Nevertheless, even with such high accuracy, there are specific directions for further research and improvement of the model. Firstly, expanding the dataset's size and variability with photos from different zones taken in various weather conditions could further improve the model's performance. Secondly, more elaborate procedures, including transfer learning and data augmentation, could be studied, which would enhance the result and minimize


overfitting risks. Moreover, deploying this model in real-world applications such as mobile apps or automated field monitoring could help farmers diagnose diseases early enough. In practical agriculture, integrating this system with IoT devices and remote sensing could provide real-time monitoring and decision-making capability, improving the sustainability and output of food systems. Thus, the trained CNN model is a viable approach to recognizing broccoli leaf diseases and could serve further purposes, such as distinguishing diseases in other crops and plants. More work in this field would help to support precision agriculture, increase agricultural output, and reduce crop failure due to diseases.

CONFLICTS OF INTEREST

The authors declare that they have no conflicts of interest.

REFERENCES

[1] C. Sarkar, D. Gupta, U. Gupta, and B. B. Hazarika, "Leaf disease detection using machine learning and deep learning: Review and challenges," Applied Soft Computing, vol. 145, p. 110534, 2023. Available from: https://doi.org/10.1016/j.asoc.2023.110534
[2] S. S. Harakannanavar, J. M. Rudagi, V. I. Puranikmath, A. Siddiqua, and R. Pramodhini, "Plant leaf disease detection using computer vision and machine learning algorithms," Global Transitions Proceedings, vol. 3, no. 1, pp. 305-310, 2022. Available from: https://doi.org/10.1016/j.gltp.2022.03.016
[3] K. Prabavathy, M. Bharath, K. Sanjayratnam, N. S. S. R. Reddy, and M. S. Reddy, "Plant Leaf Disease Detection using Machine Learning," in 2023 2nd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 2023, pp. 378-382. Available from: https://doi.org/10.1109/ICCCNT45670.2019.8944556
[4] J. Yao, S. N. Tran, S. Sawyer, et al., "Machine learning for leaf disease classification: data, techniques and applications," Artificial Intelligence Review, vol. 56, Suppl. 3, pp. 3571-3616, 2023. Available from: https://doi.org/10.1007/s10462-023-10610-4
[5] K. Joshi, R. Awale, S. Ahmad, S. Patil, and V. Pisal, "Plant Leaf Disease Detection Using Computer Vision Techniques and Machine Learning," ITM Web Conf., vol. 44, p. 03002, 2022. Available from: https://doi.org/10.1051/itmconf/20224403002
[6] J. Andrew, J. Eunice, D. E. Popescu, M. K. Chowdary, and J. Hemanth, "Deep Learning-Based Leaf Disease Detection in Crops Using Images for Agricultural Applications," Agronomy, vol. 12, no. 10, p. 2395, 2022. Available from: https://doi.org/10.3390/agronomy12102395
[7] U. Fulari, R. Shastri, and A. Fulari, "Leaf Disease Detection Using Machine Learning," in 15th IEEE International Conference, 2020, pp. 1828-1832. Available from: https://doi.org/10.1109/ICCCNT45670.2019.8944556
[8] W. B. Demilie, "Plant disease detection and classification techniques: a comparative study of the performances," Journal of Big Data, vol. 11, p. 5, 2024. Available from: https://doi.org/10.1186/s40537-023-00863-9

ABOUT THE AUTHORS

Saikat Banerjee is currently employed as a teacher at Vivekananda Mahavidyalaya, a state-aided college, in the department of BCA. He is also pursuing his Ph.D. in Computer Science at the University of Burdwan, located in Burdwan, West Bengal, India. He obtained his Bachelor of Science degree with a specialization in Computer Science and his Master of Computer Application (MCA) degree in 2013 from the University of Burdwan in West Bengal, India. He has more than 11 years of teaching experience. He has published several articles in various reputable journals and conferences. His research interests encompass a variety of topics, such as deep learning, soft computing, artificial intelligence, and machine learning.

Soumitra Das completed his MSc in Computer Science at the University of Burdwan in Burdwan, West Bengal, India. He earned his Bachelor of Computer Application in 2022 from Kazi Nazrul University in West Bengal, India. He has more than two years of development and data analysis experience. He has authored numerous publications in several esteemed journals and conferences. He is presently a software developer and machine learning enthusiast. His research interests include deep learning, soft computing, artificial intelligence, and machine learning.

Dr. Abhoy Chand Mondal is presently a Professor and Head of the Department of Computer Science at the University of Burdwan in Burdwan, West Bengal, India. In 1987, he graduated with a Bachelor of Science in Mathematics with honors from The University of Burdwan. In 1989 and 1992, he earned a Master of Science in Mathematics and an MCA from Jadavpur University. In 2004, he obtained a doctoral degree from Burdwan University. He has 28 years of experience in teaching and research and one year of work experience in the sector. He has published more than 120 articles and more than 60 journal papers. His areas of study interest include fuzzy logic, soft computing, document processing, natural language processing, big data analytics, machine learning, deep learning, and other areas.
