
Design and Analysis of a Potato Leaf Disease Detection System Using Deep Learning

Sahil Mittal1, Shrey Purushottam Deotare2, Abhay Jhalani3, Aditya Kaushik4, Akshay Singh Bani5, Dr. Vijay Kumar Trivedi6
1-6 SCSE, VIT Bhopal University, Bhopal, Madhya Pradesh, India

ABSTRACT
Advancements in deep learning have transformed image classification tasks, especially in agriculture for disease diagnosis. This research presents a comparative examination of a custom Convolutional Neural Network (CNN) and four well-known pre-trained models (VGG16, ResNet50, MobileNet, and EfficientNet) on a subset of the Plant Village dataset focused on potato leaf disease. The dataset consists of three classes: early blight, healthy, and late blight, with a total of 2,152 images. The models were trained under identical settings to evaluate their effectiveness in classifying leaf health. Metrics such as accuracy, mean absolute error (MAE), and the Kappa statistic were applied to assess model performance.

The custom CNN was designed to extract relevant features from the images, while the pre-trained models, known for their architectural depth and robustness, were fine-tuned for the same objective. This study aims to determine the most efficient model, balancing accuracy and computational cost. The results reveal that while the pre-trained models performed well due to their architectures and transfer learning capabilities, the custom CNN also displayed competitive results with reduced computational overhead. The insights from this research are expected to aid in selecting suitable models for real-world applications in plant disease detection, providing a useful contribution to precision agriculture.

I. INTRODUCTION

In recent years, the rapid growth of artificial intelligence (AI) and machine learning (ML) has revolutionized various sectors, particularly computer vision, where the need for reliable image classification systems has become increasingly crucial. The enormous rise of visual data created across many platforms, from social media to medical imaging, has prompted researchers to construct sophisticated algorithms capable of efficiently and precisely processing images. Convolutional Neural Networks (CNNs) have emerged as a dominant paradigm in this field, largely due to their ability to automatically learn and extract hierarchical features from raw image data, thus decreasing the dependency on manual feature engineering.

Despite the success of CNNs, there remains a continuing challenge in designing model architectures that attain high accuracy while minimizing computational cost. Various architectures, such as VGG16, ResNet50, MobileNet, and EfficientNet, have been proposed, each bringing unique techniques and breakthroughs that address these difficulties. These pre-trained models have shown state-of-the-art performance on benchmark datasets; nevertheless, their usefulness can vary greatly depending on the nature of the task and the particular properties of the dataset employed.

II. LITERATURE REVIEW

The introduction of artificial intelligence (AI) and machine learning (ML) has substantially transformed several disciplines, particularly the field of computer vision. As the demand for precise and efficient image recognition systems grows, the development of new algorithms has become important. Among the most effective approaches to tackling image classification challenges are deep learning techniques, notably Convolutional Neural Networks (CNNs). CNNs have emerged as a leading methodology due to their capacity to automatically learn hierarchical features from raw image data, minimizing the requirement for manual feature extraction and providing exceptional accuracy in classification tasks.

This article conducts a comparative analysis of a custom CNN model versus these pre-trained architectures to evaluate their performance in image classification tasks. The custom model, developed with tailored features and optimizations, aims to balance accuracy and computational efficiency while being trained on the same dataset as the aforementioned pre-trained models.

Furthermore, this study investigates several performance indicators, including accuracy, mean absolute error (MAE), and the Kappa statistic, offering a full review of each model's usefulness. By examining the strengths and shortcomings of each architecture, this research strives to give meaningful insights into model selection and optimization for practitioners in the field.

In summary, as the landscape of computer vision continues to grow, understanding the trade-offs between multiple CNN architectures becomes crucial. This study will not only highlight the comparative performance of the custom model but also offer a deeper understanding of how architectural choices affect outcomes in real-world applications. The findings aim to expand the knowledge base in deep learning and provide recommendations for future research and practical implementations.

1. Custom Model
The custom model developed in this research leverages the principles of CNNs while introducing modifications tailored to the specific dataset and application domain. By employing techniques such as data augmentation, regularization, and a customized architecture, the model aims to enhance generalization and robustness. The architecture is designed to balance complexity and interpretability, allowing it to efficiently capture essential features while minimizing overfitting. Furthermore, the model's performance is benchmarked against established pre-trained architectures to assess its efficacy and potential advantages. This comparative approach not only highlights the strengths of the custom model but also contributes to the broader discourse on model selection in deep learning.

2. VGG16 Model
VGG16, created by Simonyan and Zisserman in 2014, is a notable architecture recognized for its simplicity and depth. The model has 16 layers, defined by small convolutional filters (3x3), which allow it to learn sophisticated spatial hierarchies. VGG16 achieved impressive results in the ImageNet competition, indicating that depth in CNNs is critical for capturing complicated characteristics. The model's uniform architecture makes it easy to implement, although its depth demands large computational resources, notably in terms of memory and processing power. Despite these drawbacks, VGG16 serves as a core framework for many subsequent models and remains useful in numerous computer vision applications.

3. ResNet50 Model
ResNet50, developed by He et al. in 2015, revolutionized CNN architectures by introducing the concept of residual learning through skip connections. This architecture overcomes the vanishing gradient problem typically found in deep networks, enabling the training of extraordinarily deep models (up to hundreds of layers). ResNet50 consists of 50 layers and displays outstanding performance on image classification tasks, reaching state-of-the-art results on the ImageNet dataset. The use of residual blocks not only improves accuracy but also speeds up convergence during training. ResNet's modularity and versatility make it a favorite choice for transfer learning and fine-tuning across varied applications.

4. MobileNet
MobileNet, developed by Howard et al. in 2017, is built primarily for mobile and embedded vision applications. This architecture employs depthwise separable convolutions, which factorize ordinary convolutions into a depthwise convolution followed by a pointwise convolution. This strategy greatly reduces the number of parameters and the computational cost, enabling effective model deployment on resource-limited devices. MobileNet maintains a balance between model size and precision, making it suited for real-time applications. The architecture has seen considerable adoption due to its lightweight nature and efficacy in many image classification tasks, especially in mobile settings.

5. EfficientNet
EfficientNet, introduced by Tan and Le in 2019, offers a breakthrough in neural architecture design, optimizing performance through a compound scaling mechanism. EfficientNet scales the model up uniformly across depth, width, and resolution, achieving state-of-the-art accuracy while remaining computationally efficient. The architecture is built upon a baseline model that leverages Mobile Inverted Bottleneck Convolutions (MBConv), boosting feature extraction while decreasing computational cost. EfficientNet's performance on benchmark datasets surpasses that of earlier architectures, making it an appealing candidate for image classification and other computer vision applications. The model's efficiency and accuracy underline the potential for building more compact and powerful models for real-world applications.

III. RESEARCH METHODOLOGY

A. DATASET ATTRIBUTES

The dataset used in this research was derived from the Plant Village dataset on Kaggle [1], focusing on the potato leaf subset. This collection contains photos divided into three classes: early_blight, healthy, and late_blight. The dataset consists of 1,000 images of potato leaves affected by early blight, 152 images of healthy leaves, and 1,000 images of leaves affected by late blight, covering both diseased and healthy conditions. This curated dataset is appropriate for training convolutional neural networks (CNNs) to categorize potato leaf states and offers a useful foundation for assessing different model architectures in plant disease detection.
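As a concrete illustration, a dataset organized in per-class folders can be loaded with tf.keras.utils.image_dataset_from_directory; this is a minimal sketch rather than the authors' exact code, and the directory path used below is a hypothetical placeholder.

```python
import tensorflow as tf

IMAGE_SIZE = 256
BATCH_SIZE = 32

# Hypothetical local path; the three class sub-folders (early_blight,
# healthy, late_blight) are expected directly below it.
DATA_DIR = "PlantVillage/Potato"

# Load the images as a batched tf.data.Dataset, inferring labels from folder names.
dataset = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR,
    shuffle=True,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
)

class_names = dataset.class_names
print(class_names)  # e.g. ['early_blight', 'healthy', 'late_blight']
```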
In our project, various attributes and hyperparameters are employed across multiple stages, spanning the dataset properties, the model architecture, the training procedure, and the performance measures. The important attributes are summarized below.

1. Dataset Attributes:
• IMAGE_SIZE: The size of the images after resizing (256x256 pixels).
• BATCH_SIZE: The number of images used per training batch (32).
• CHANNELS: The number of color channels in the images (3 for RGB).
• class_names: The list of class labels in the dataset.

2. Data Preprocessing and Augmentation:
• Rescaling: Scaling pixel values to the range [0, 1] by dividing by 255.
• Resizing: Resizing the input images to the specified IMAGE_SIZE (256x256).
• RandomFlip: Flipping images randomly during data augmentation.
• RandomRotation: Randomly rotating images (0.2 radians).
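These preprocessing and augmentation steps map naturally onto Keras preprocessing layers. The following is a minimal sketch of one plausible composition; the exact arrangement in the authors' code is not specified, so the flip mode and layer grouping here are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

IMAGE_SIZE = 256

# Resize every image to 256x256 and scale pixel values into [0, 1].
resize_and_rescale = tf.keras.Sequential([
    layers.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    layers.Rescaling(1.0 / 255),
])

# Random flips and rotations, applied only during training as data augmentation.
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),  # assumed flip mode
    layers.RandomRotation(0.2),
])
```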
3. Model Architecture Attributes:

a. Convolution Layers:
• Number of Conv2D layers: 6 layers in the custom model.
• Filters: 32 filters in the first Conv2D layer, 64 filters in the subsequent layers.
• Kernel Size: (3x3) in all Conv2D layers.

b. Activation Functions:
• ReLU: Used as the activation function in the convolution and dense layers.
• Softmax: Used in the output layer for multi-class classification.

c. Pooling Layers:
• MaxPooling2D: Pool size of (2x2) after each convolution layer to reduce spatial dimensions.

d. Dense Layers:
• Flatten: Converts the 2D feature maps to 1D for the dense layers.
• One Dense layer with 128 units (fully connected).

e. Output Layer:
• A Dense layer with the number of units equal to the number of classes (n_classes = 3), using softmax activation.
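Putting the attributes above together, one plausible Keras definition of the custom model is sketched below: six 3x3 Conv2D layers (32 filters, then 64), each followed by 2x2 max pooling, then a flatten, a 128-unit dense layer, and a 3-way softmax output. The exact ordering and any regularization in the authors' implementation may differ.

```python
from tensorflow.keras import layers, models

IMAGE_SIZE = 256
CHANNELS = 3
n_classes = 3

model = models.Sequential([
    layers.Input(shape=(IMAGE_SIZE, IMAGE_SIZE, CHANNELS)),

    # Six Conv2D blocks: 32 filters in the first layer, 64 thereafter,
    # each followed by 2x2 max pooling.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),

    # Classifier head: flatten, one 128-unit dense layer, softmax output.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),
])

model.summary()
```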
4. Training Attributes:
• EPOCHS: The number of complete passes through the training data (50 epochs).
• Optimizer: Adam (used for optimization during training).
• Loss Function: Sparse categorical cross-entropy (for multi-class classification).
• Metrics: Accuracy (used for monitoring performance during training).
• Shuffle Size: Shuffle buffer size during training (1000).
• Buffer Size: Buffer size for data prefetching (tf.data.AUTOTUNE).
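A training loop consistent with these attributes could look like the sketch below, continuing the earlier snippets (dataset, resize_and_rescale, data_augmentation, model). The train/validation split is not restated in this section, so feeding the full dataset here is a simplification.

```python
import tensorflow as tf

EPOCHS = 50
SHUFFLE_SIZE = 1000

# Apply preprocessing and augmentation, then shuffle and prefetch for an
# efficient tf.data input pipeline.
train_ds = (
    dataset
    .map(lambda x, y: (resize_and_rescale(x), y))
    .cache()
    .shuffle(SHUFFLE_SIZE)
    .map(lambda x, y: (data_augmentation(x, training=True), y))
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)

# Adam optimizer, sparse categorical cross-entropy loss, accuracy metric.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

history = model.fit(train_ds, epochs=EPOCHS)
```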
5. Evaluation Metrics:
• Accuracy: Measures the percentage of correctly predicted samples.
• Precision: Measures the proportion of true positive predictions out of all positive predictions.
• Recall: Measures the proportion of true positive predictions out of all actual positives.
• F1-Score: The harmonic mean of precision and recall.
• Training Time: Total time taken to train the model.
• Average Time per Epoch: Average time taken for each epoch during training.

In this work, several critical metrics were employed to evaluate the performance of both the custom CNN model and the pre-trained models (VGG16, ResNet50, MobileNet, and EfficientNet). These indicators include accuracy, mean absolute error (MAE), the Kappa statistic, and training time, offering a full picture of the models' strengths and limits.

To acquire deeper insight into model performance, confusion matrices were also produced for each model (Figures 5-9). A confusion matrix displays the true positive, false positive, true negative, and false negative counts, allowing a clearer view of how effectively the models classified each of the three categories: early blight, healthy, and late blight. These matrices provide a more detailed view of model accuracy by identifying specific areas where the models may struggle (e.g., misclassification between early blight and late blight). The values in the confusion matrices are particularly valuable for producing metrics such as the Kappa statistic and can assist in spotting model bias toward certain classes.

By examining the confusion matrices alongside the other assessment measures, we can better understand the precision, recall, and overall classification effectiveness of each model, ensuring a full comparison of their performance.
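A sketch of how these per-class results can be obtained with scikit-learn is shown below; it assumes a held-out test split test_ds shaped like the training dataset, which is an assumption since the split is not restated in this section.

```python
import numpy as np
from sklearn.metrics import (classification_report, cohen_kappa_score,
                             confusion_matrix, mean_absolute_error)

# Collect true and predicted class indices over the held-out test set.
y_true, y_pred = [], []
for images, labels in test_ds:
    probs = model.predict(images, verbose=0)
    y_true.extend(labels.numpy())
    y_pred.extend(np.argmax(probs, axis=1))

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=class_names))
print("Kappa:", cohen_kappa_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
```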
6. Comparison of Pre-Trained Models:

Pre-trained models used:
• VGG16: A 16-layer convolutional neural network.
• ResNet50: A 50-layer residual network.
• MobileNet: A lightweight neural network for mobile devices.
• EfficientNet: A state-of-the-art convolutional neural network known for efficiency.

A major element of this research is comparing the custom CNN model with four pre-trained models (VGG16, ResNet50, MobileNet, and EfficientNet) under identical conditions and evaluation measures. This comparison is shown in Table 1, which provides a full overview of each model's performance in terms of accuracy, loss, Kappa statistic, MAE, F1-score, precision, and the average duration per epoch.

The comparison illustrates the balance between accuracy and computational efficiency, as reflected in the average time per epoch for each model. While some pre-trained models display superior accuracy, the custom CNN gives competitive outcomes with reduced computational cost. The F1-score and precision metrics offer insight into each model's performance across the three classes (early blight, healthy, and late blight) by analyzing the balance between precision and recall. Additionally, the Kappa statistic, which assesses the agreement between predicted and true labels beyond chance, varies considerably across the models and highlights differences in their classification capability.

This full comparison, as presented in Table 1, permits us to evaluate not only the classification accuracy of each model but also its efficiency in terms of time and resources, making it easier to choose the model best suited for real-world applications.
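For the pre-trained models, a common way to reuse them under the same input size and class count is the transfer-learning pattern sketched below with MobileNet as an example; the frozen base, pooling layer, and classification head are assumptions rather than the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

IMAGE_SIZE = 256
n_classes = 3

# Pre-trained backbone with ImageNet weights and without its original classifier.
base = tf.keras.applications.MobileNet(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    include_top=False,
    weights="imagenet",
)
base.trainable = False  # freeze the convolutional base; train only the new head

transfer_model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),
])

transfer_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)
```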

B. KAPPA STATISTICS

The Kappa statistic, also known as Cohen's Kappa, is a metric used to measure the agreement between two raters or models, adjusted for the agreement occurring by chance. In machine learning, it is often used to assess the agreement between the true labels and the predicted labels of a model, particularly in classification tasks.
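In equation form, writing p_o for the observed agreement between predicted and true labels and p_e for the agreement expected by chance:

$$ \kappa = \frac{p_o - p_e}{1 - p_e} $$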

Interpretation of Kappa:
• 1.0: Perfect agreement.
• 0: Agreement equivalent to chance.
• Negative values: Worse than random guessing.

Kappa provides a more robust measure than accuracy, especially for imbalanced datasets where the majority class could dominate the accuracy score.

C. MEAN ABSOLUTE ERROR (MAE)

Mean Absolute Error (MAE) is a statistical metric used to measure the average magnitude of the errors between predicted values and actual values in a regression model or any predictive modeling task. It is calculated as the average of the absolute differences between the predicted and actual values.
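In equation form, for n samples with actual values y_i and predicted values ŷ_i:

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| $$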

Table 1: Comparison of machine learning models with the custom model on the full dataset.

| Model Name | Time Per Epoch (Sec) | Accuracy (%) | Loss (%) | Attributes | No. of Instances | Kappa Statistic | MAE | Precision | F1-Score |
|---|---|---|---|---|---|---|---|---|---|
| Custom Model | 30.94 | 97.70 | 1.30 | 9 | 2152 | 0.083 | 0.55 | 0.981 | 0.980 |
| VGG16 Model | 139.88 | 98.04 | 1.96 | 9 | 2152 | -0.0084 | 0.62 | 1.0 | 1.0 |
| ResNet50 Model | 113.51 | 67.97 | 32.03 | 9 | 2152 | 0.037 | 0.68 | 0.879 | 0.847 |
| MobileNet Model | 48.97 | 99.22 | 0.78 | 9 | 2152 | 0.141 | 0.52 | 0.996 | 0.996 |
| EfficientNet Model | 66.13 | 46.10 | 53.90 | 9 | 2152 | 0.0 | 0.54 | 0.198 | 0.274 |

Fig 1: Average Training Time Per Epoch for Each Model.
Fig 2: Test Accuracy for Each Model.
Fig 3: Training and Validation Accuracy.
Fig 4: Receiver Operating Characteristic (ROC) Curve.
Fig 5: Confusion Matrix of Custom Model.
Fig 6: Confusion Matrix of VGG16 Model.
Fig 7: Confusion Matrix of ResNet50 Model.
Fig 8: Confusion Matrix of MobileNet Model.
Fig 9: Confusion Matrix of EfficientNet Model.
model’s complexity being unnecessary for the task. While
D. DISCUSSION ResNet models are frequently excellent in deep feature
extraction, they require more data and fine-tuning to work
effectively, which may explain its underperformance here.
The results described in Table 1 reveal variable
performances across the models—both custom and pre-
trained—on the provided huge dataset. Each model MobileNet:
demonstrates strengths and limits across key measures The MobileNet model displays the highest accuracy of
like accuracy, loss, time per epoch, and kappa statistics. 99.22% with the lowest loss (0.78%). It also trains quite
quickly (48.97 seconds each epoch). MobileNet’s
1. Performance of the Custom Model lightweight architecture allows it to excel in both
accuracy and efficiency, making it extremely ideal for
The custom model demonstrates remarkable performance edge devices and mobile applications. This result
with an accuracy of 98.70% and a low loss of 1.30%. Its illustrates MobileNet’s ability to generalize well across
training efficiency is particularly impressive, taking only datasets without losing speed.
30.94 seconds per epoch, which is much faster than
VGG16 (139.88 seconds) and ResNet50 (113.51
EfficientNet:
seconds). This illustrates that the bespoke model, despite
EfficientNet, despite its promising architecture,
being simpler, is well-tuned for the specific dataset at
performs poorly with 46.10% accuracy and a loss of
hand.
53.90%. These results show that EfficientNet may
require more data or hyperparameter adjustment to
perform effectively on this specific dataset. EfficientNet
The custom model’s strong performance suggests that:
is often known for balancing efficiency and accuracy,
 It may be well-suited for the dataset’s attributes,
however in this situation, its default setup may not be
potentially due to optimal feature extraction for
well-aligned with the dataset features.
the problem space.
 The relatively small training time makes it
3. Kappa Statistic and MAE Insights
suitable for fast prototyping or real-time
applications where speed is critical.
The Kappa Statistic displays how well forecasts
 However, the Kappa Statistic (0.083), while
correlate with the actual classes beyond random chance.
positive, is not highly significant, suggesting
The highest result for MobileNet (0.141) demonstrates
that there could be room for further
its outstanding reliability. In contrast, VGG16’s negative
improvement in class agreement.
kappa statistic shows low inter-class agreement,
indicating concerns with model consistency across
2. Comparison with Pre-Trained
labels.

Models VGG16:
The Mean Absolute Error (MAE) for all models
While VGG16 achieves accuracy, similar to the custom remained within a modest range (0.52–0.68). However,
model’s performance, it suffers from a substantially MobileNet and the custom model have somewhat lower
higher time per epoch (139.88 seconds). This correlates MAE values, indicating improved forecast stability.
with VGG16’s architecture, recognized for being
computationally intensive because to its deep network IV. CONCLUSION
layers. Although effective in many cases, VGG16 might
not be the best choice for this dataset if processing
Based on the comparison research, MobileNet delivers
efficiency is prioritized.
the optimum trade-off between accuracy, loss, and
ResNet50: training time. However, the custom model still shows to
ResNet50 exhibits a substantially lower accuracy be a tempting alternative, especially where computing
(67.97%) with a high loss of 32.03%, indicating that the performance is crucial. The bespoke model’s reduced
architecture might not generalize well on this dataset. training period makes it more suitable for time-sensitive
This could be due to overfitting difficulties or the situations, and its performance of 98.70% accuracy is just
significantly lower than MobileNet's.
Each model offers distinct advantages and challenges:

a. Custom Model:
• Advantages: Fast training, high accuracy, well-suited for smaller or specific datasets.
• Disadvantages: May not generalize well to other datasets or more complex tasks.

b. VGG16:
• Advantages: Good accuracy, effective for deep feature extraction.
• Disadvantages: Computationally intensive, longer training times.

c. ResNet50:
• Advantages: Strong feature extraction for complex datasets.
• Disadvantages: Requires extensive data and fine-tuning; underperformed on this dataset.

d. MobileNet:
• Advantages: High accuracy, lightweight, efficient for mobile and embedded applications.
• Disadvantages: May have limited scalability for large-scale models.

e. EfficientNet:
• Advantages: Designed to balance performance and efficiency, scalable.
• Disadvantages: Poor performance on this dataset, requiring more tuning and data.

In conclusion, the custom model and MobileNet emerge as the top-performing choices for the given dataset. MobileNet's high accuracy and low loss make it ideal for applications where precision matters most, while the custom model's shorter training time suggests that it could be favored for time-constrained or resource-limited environments.

V. REFERENCES

[1] Tejaswi, A. (2018). Plant Village Dataset. Kaggle. https://www.kaggle.com/datasets/arjuntejaswi/plant-village
[2] Abdulridha, J., Ehsani, R., Abd-Elrahman, A., Ampatzidis, Y. (2019). A remote sensing technique for detecting laurel wilt disease in avocado in presence of other biotic and abiotic stresses. Computers and Electronics in Agriculture, 156: 549-557. https://doi.org/10.1016/j.compag.2018.12.018
[3] Adhao, A.S., Pawar, V.R. (2017). Machine learning regression technique for cotton leaf disease detection and controlling using IoT. International Conference on Electronics, Communication and Aerospace Technology (ICECA 2017), IEEE.
[4] Ahmad, N., Singh, S. (2021). Comparative Study of Disease Detection in Plants using Machine Learning and Deep Learning. Second International Conference on Secure Cyber Computing and Communication (ICSCCC), IEEE; Amrita, S., Raul, T.N. (2019). Plant Leaf Disease Detection using Machine Learning. 10th ICCCNT 2019, IIT Kanpur, Kanpur, India.
[5] Aruraj, A., Alex, A., Subathra, M.S.P., Sairamya, N.J., George, S.T., Eward, S.E.V. (2019). Detection and Classification of Diseases of Banana Plant Using Local Binary Pattern and Support Vector Machine. 2019 International Conference on Signal Processing and Communication (ICSPC 2019), March 29-30, 2019, Coimbatore, India, IEEE.
[6] Asta, L., Gomathi, M.V. (2021). Automatic Prediction of Plant Leaf Diseases Using Deep Learning Models: A Review. 5th International Conference on Electrical, Electronics, Communication, Computer Technologies and Optimization Techniques (ICEECCOT).
[7] Bhimte, N.R., Thool, V.R. (2018). Diseases Detection of Cotton Leaf Spot Using Image Processing and SVM Classifier. Second International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, pp. 340-344. DOI: 10.1109/ICCONS.2018.8662906
[8] Chandraprabha, M.T., Singh, A.S. (2021). Prediction of Crop Diseases to Improve Crop Yield Using Ensemble Methods. 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), IEEE. DOI: 10.1109/ICAC3N53548.2021.9725472
[9] Davoud, A., Aghighi, H., Matkan, A.A., Mobasheri, M.R., Rad, A.M. (2016). An investigation into machine learning regression techniques for the leaf rust disease detection using hyperspectral measurement. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(9): 4344-4351.
[10] Jasim, M.A., AL-Tuwaijari, J.M. (2020). Plant Leaf Diseases Detection and Classification Using Image Processing and Deep Learning Techniques. 2020 International Conference on Computer Science.
[11] Panchal, P., Raman, V.C., Mantri, S. (2019). Plant Diseases Detection and Classification using Machine Learning Models. IEEE.
[12] Langar, G., Jain, P., Nikhil (2020). Tomato Leaf Disease Detection using Artificial Intelligence and Machine Learning. International Journal of Advance Scientific Research and Engineering Trends.
[13] Dhingra, G., Kumar, V., Joshi, H.D. (2017). Study of digital image processing techniques for leaf disease detection and classification. Springer Science+Business Media. https://doi.org/10.1007/s11042-017-5445-8
[14] Sladojevic, S., Arsenovic, M., Anderla, A., Culibrk, D., Stefanovic, D. (2016). Deep neural networks based recognition of plant diseases by leaf image classification. Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages.
