Automated Machine Learning Classification Framework to Predict Crop Yield and Detect Pest
Patterns
Gopi R1*, Tamil Selvi M2, Saranraj G3, Nagaraj P4, Parthiban K5 and Ranjith Kumar A6
1Faculty of Computer Science & Engineering, Dhanalakshmi Srinivasan Engineering College, Perambalur-621212, Tamil Nadu, India
2Faculty of Computer Science and Engineering, Roever Engineering College, Perambalur-621212, Tamil Nadu, India
3Faculty of Artificial Intelligence and Data Science, Dhanalakshmi Srinivasan College of Engineering, Coimbatore 105, Tamil Nadu, India
4Faculty of Artificial Intelligence and Machine Learning, K. Ramakrishnan College of Engineering, Samayapuram - 621112, Tamil Nadu, India
5Faculty of Computer Science and Engineering, Dhanalakshmi Srinivasan College of Engineering, Coimbatore - 641105, Tamil Nadu, India
6Faculty of Computer Science and Engineering, Lovely Professional University, Phagwara, Punjab, India
E-mail/Orcid Id:
GR, [email protected], https://orcid.org/0000-0003-4957-1843; TSM, [email protected], https://orcid.org/0009-0003-4642-2807;
SG, [email protected], https://orcid.org/0009-0000-6412-115X; NP, [email protected], https://orcid.org/0009-0007-9438-1973;
PK, [email protected], https://orcid.org/0009-0007-2385-5529; RKA, [email protected], https://orcid.org/0000-0003-4383-9212
Article History:
Received: 01st Jun., 2024
Accepted: 18th Dec., 2024
Published: 30th Dec., 2024

Keywords: Machine Learning, Plant Disease, Classification, Crop yield, Reliability

How to cite this Article: Gopi R, Tamil Selvi M, Saranraj G, Nagaraj P, Parthiban K and Ranjith Kumar A (2024). Automated Machine Learning Classification Framework to Predict Crop Yield and Detect Pest Patterns. International Journal of Experimental Research and Review, 46, 177-190.
DOI: https://doi.org/10.52756/ijerr.2024.v46.014

Abstract: Plant disease identification is crucial to food security and agricultural product availability. Traditional disease diagnosis can be tedious, annoying, and inaccurate. The investigation examines how modern machine learning algorithms might improve plant disease diagnostics for efficacy and precision. Despite this, machine learning faces many obstacles, including model training, processing costs, and rising demand for large data sets. This study proposes a novel method called Automated Machine Learning Classification Framework (AMLCF) to predict crop yield and detect pest patterns. This framework simplifies model selection, hyperparameter adjustment, and feature engineering for non-experts. The amount of time and computational resources needed have additionally been greatly reduced. The suggested AMLCF is evaluated on different unique agricultural datasets to validate its plant disease detection versatility. Our extensive simulation analysis found that AMLCF exceeds existing machine learning methods in speed, accuracy, and usability. AMLCF's detailed demonstration shows this; besides predicting plant illnesses, this system can predict crop yield and detect pests. Those findings suggest AMLCF could transform farming. Better plant health monitoring, early disease identification, and farmer selection could be achieved. The experimental results show that the proposed AMLCF model increases the accuracy ratio by 92.6%, computational efficiency analysis by 97.4%, versatility analysis by 98.3%, user accessibility ratio by 99.1%, and crop health tracking analysis by 94.8% compared to other existing models.
Int. J. Exp. Res. Rev., Vol. 46: 177-190 (2024)

also shown in this paper. Season 2's unusual reactions highlight the impact of certain characteristics on model performance, which is an intriguing development. This study adds to the growing knowledge of optimizing ANNs for precise insect prediction in grain crops, which helps create more effective and efficient pest management methods. It is also well-suited for real-world applications due to the transformer model's persistent dominance.

Materials and Methods
The suggested approach uses an Automated Machine Learning Classification Framework (AMLCF) to improve plant disease detection. AMLCF's use of advanced machine-learning techniques makes it easier for the system to capture images and subsequently categorize diseases. The major goal is to increase agricultural production through improved efficiency and accuracy of plant disease detection, which helps crop
methods, these contributions tackle important issues in contemporary agriculture, such as optimizing resources and controlling pests promptly.

Contribution 1: Design of AMLCF
The recommended AMLCF goes through three stages to greatly improve detection efficiency: auto-preprocessing, segmentation, and feature extraction. This simplified process reduces the computation resources and time needed for plant disease detection. Moreover, the state-of-the-art machine learning approaches used in this framework boost classification precision, resulting in reliable and accurate identification of various plant diseases. Improved crop health management follows, as well as timely intervention actions.

Figure 1 shows the data-gathering procedure used to identify plant diseases within an AMLCF framework. The initial step involves capturing raw pictures of plants using cameras or sensors, which is called image acquisition. Each picture is then annotated to show relevant information, such as the kind of ailing plant, after which it is stored in an annotated dataset. Then, these photos are pre-processed during the image processing phase so that patterns are easier to find. For instance, this stage can utilize noise reduction, contrast adjustments or segmentation methods to separate areas of interest depicted in the photos from each other. Next comes feature extraction, where certain attributes or features are recognized and extracted from the processed pictures. These attributes could be texture, colour and form, among others. These characteristic features are necessary for differentiating between different illnesses in plants. Lastly is classification, where machine learning methods are employed to sort the images into different illness categories based on the retrieved attributes. The model is trained on the identified dataset to make illness predictions in fresh, unlabelled photos. Improved plant health surveillance and early disease diagnosis are made possible by this simplified technique, which is enabled by AMLCF and increases the effectiveness and reliability of plant disease identification.

$$ s_1 o^{r(a+d)} + o_2 r^{-r(a+d)} = o_3 r^{r(a)} + o_4 r^{-r(a)} \tag{1} $$

Equation 1 may be connected to the suggested Automated Machine Learning Classification Framework $s_1$ as follows: factors $(a+d)$ or elements that are tweaked $o_2$ throughout $-r$, the procedure of machine learning, are given by $a+d$ and $r(a)$, but the scores or indices modified to maximize the reliability of the model are denoted by $r^{-r(a)}$ and $o_4$.

$$ \frac{v_3}{v_2} g^{r(a+d)+r(a)} + \frac{v_4}{v_2} h^{r(a+d)-r(a)} - \frac{v_1}{v_2} f^{2r(a+d)} \tag{2} $$

Equation 2 could be understood as $\frac{v_3}{v_2}$ representing various plant health indicators $\frac{v_4}{v_2}$, and $g$ and $h$ representing features extracted $a+d$ from agricultural datasets $\frac{v_1}{v_2}$. Possible representations $r(a)$ of the model's sensitivity to changes in data or circumstances may be found in the exponents of $(a+d)$ and $-r(a)$.

This study's simulation environment was built using MATLAB Simulink, a powerful tool for modelling, simulating, and evaluating dynamic systems. The deciding factors were the environment's adaptability in incorporating machine learning models for prediction tasks and its capacity to manage massive agricultural information. Using past data on crop yields, patterns of pest infestation, and weather variables like temperature, humidity, and rainfall made the simulation seem like real-life agricultural circumstances. Because of the system's adaptability, seasonality and other regional considerations might be dynamically important. This study verified the suggested framework's accuracy and resilience in this controlled setting, which minimized the time and money needed for actual field trials.

Contribution 2: Implementation of Machine Learning
Model selection, hyperparameter tweaking, and feature engineering are just a few of the frequent machine-learning difficulties the AMLCF automates. The barrier to entry is lowered, and complex machine learning methods become accessible to non-experts because of this simplicity. Because the framework is easy to use, even for those without much technical training, more people will put it to use in the field, which is great for the agricultural industry.

The whole procedure of using machine learning technology to identify plant diseases is shown in Figure 2. The first step is to get photos from field crops, specifically from leaves, using either cameras or sensors. These pictures are stored in the Leaf Image Database. Image pre-processing improves the quality of the photos by using cropping filters and separation algorithms, which then isolate the leaf region. This phase is critical for precise disease identification because it zeroes in on the leaf and eliminates extraneous background data. Data Splitting is the next step after pre-processing the photos; it involves dividing the dataset into three parts: training, validation, and testing. This guarantees accurate model validation and training as well as reliable performance evaluation.

To train a machine learning model, one uses training and validation datasets. Here, the model is trained to recognize characteristics linked to different plant diseases in the training set, and its accuracy is adjusted with the help of the validation set. The trained model's performance is evaluated using the Test Set. In this stage, the model's accuracy in disease classification is tested with fresh, previously unseen photos of plants. Lastly, Plant Leaf Classification employs the performance-assessed model, which enables precise disease identification for efficient plant surveillance and early disease treatment. To improve the accuracy and efficiency of plant disease diagnoses, this AutoML approach simplifies the workflow.

$$ \sigma_3 \left( \frac{\vartheta_1}{\sqrt{1+\beta}} + \frac{\vartheta_2}{\sqrt{1-\beta}} \right) = \sigma_1 \left( \frac{\vartheta_1}{\sqrt{1+\beta}} - \frac{\vartheta_2}{\sqrt{1-\beta}} \right) = 0 \tag{3} $$

Equation 3 can be seen as a balanced model inside AMLCF, where various parameters ($\vartheta_1$, $\vartheta_2$) are standardized and changed according to a factor ($\beta$) to attempt to reach an ideal state. Machine learning relies on this equilibrium to achieve high accuracy and efficiency; it improves performance in agricultural applications like plant disease detection by ensuring features are appropriately scaled and weighted.

$$ N(F) = \frac{1}{g} \sqrt{h - rl} \,/\, \partial(r - jf) = \tau \int_0^{\ni} i^{-hs + \frac{r}{fr}} \tag{4} $$

Equation 4 agrees with the suggested Automated Machine Learning Classification Framework $N(F)$. Accuracy of disease identification analysis, such as $\frac{1}{g}\sqrt{h - rl}$, and $\partial(r - jf)$ reflect various model and data attributes that impact $\tau$ the learning process, whereas $i^{-hs + \frac{r}{fr}}$ denotes the model's performance metrics. It simplifies difficult processes $\ni$ for enhanced plant disease detection by balancing $\frac{r}{fr}$ computing efficiency and accuracy.

Contribution 3: Demonstrate Real-World Benefits and Versatility
The AMLCF's flexibility in effectively identifying many plant diseases is shown by testing it on diverse agricultural datasets, showcasing its real-world application. The framework has other uses beyond only diagnosing diseases; it may identify pests and monitor the condition of crops. The impressive results in many settings highlight its ability to transform farming methods, leading to better yields and more efficient operations. The AMLCF's versatility makes it a useful tool in many agricultural settings.

In Figure 3, machine learning for plant disease categorization is displayed. The Dataset has to be expanded as the first step so that the model can accommodate more data. To enhance the images further, Image Filtering is done on the dataset, removing noise and other irrelevant information. Attributes Selection is the next stage in illness classification; it comes after image filtering and determines which pixels in an image are important. This processing stage determines what leaf traits indicate what diseases; hence, it is crucial for the model's reliability. After this, the dataset is separated into two parts, namely, training and test sets. Training data trains the machine learning model, while test data assesses its performance in terms of accuracy and generalizability. This ensures that our model can generalize well even on unknown samples.

The above approach gives us a Machine Learning algorithm that trains on our training data and finds features associated with various diseases of plants. The accuracy and efficiency of such a trained model are then determined when applied to test data. The trained model is applied to predict three diseases: brown spots, leaf bacteria blight, and leaf moulds. As with any other machine-learning technique, this allows future detection of the same three crop diseases.

$$ \sum_{k=1}^{s} \alpha_l + d_{l+r} - s y_{t+k} > 0, \ \propto_q + MR \tag{5} $$

Equation 5 gives a representation of the classification model's threshold-based decision-making process ($k = 1$) when the inequality is satisfied. The parameters or weights that have been learnt, $\alpha_l$ and $d_{l+r}$, aspects or data elements from the dataset, $0, \propto_q$ and $s y_{t+k}$, computational efficiency analysis and a margin or regularization term, $MR$, are all presented this time.

$$ G_i^R(u^R(x)) = G_i(u^R(x)) + a.e.\ x \in \tau, \ \sigma\, i = 1, \dots, N \tag{6} $$

variables $a.e.\ x$ and $\sigma\, i$, guarantee the implementation of customized solutions $N$ for different agricultural datasets.

The AMLCF, a computerized machine learning classification framework, is shown in Figure 4. The data are from the Agriculture Crop Images Kaggle Dataset. The Crop images dataset includes 40 or more images of every agricultural crop, including sugarcane, rice, jute, wheat, and maize. There are more than 160 enhanced
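The three-way split into training, validation, and test sets described under Contribution 2 can be illustrated with a short Python sketch. It assumes a hypothetical list of 40 labels for each of the five crops named in the dataset description (sugarcane, rice, jute, wheat, maize) in place of real image files. The 70/15/15 proportions, the function name stratified_split, and the fixed seed are illustrative choices, not values reported in this paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_pct=70, val_pct=15, seed=0):
    """Split sample indices into train/validation/test, class by class.

    Percentages are integers so the per-class counts are exact.
    Whatever is left after train and validation goes to the test set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx, test_idx = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)  # randomise order within the class
        n_train = len(idxs) * train_pct // 100
        n_val = len(idxs) * val_pct // 100
        train_idx += idxs[:n_train]
        val_idx += idxs[n_train:n_train + n_val]
        test_idx += idxs[n_train + n_val:]
    return train_idx, val_idx, test_idx


# Hypothetical stand-in for the image list: 40 labels per crop class.
CROPS = ["sugarcane", "rice", "jute", "wheat", "maize"]
labels = [crop for crop in CROPS for _ in range(40)]

train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # prints: 140 30 30
```

Splitting class by class keeps every crop represented in all three subsets, which matters when each class has only a few dozen images.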
In Figure 7 above, the wide range of applications of modern machine learning algorithms in plant disease identification demonstrates their versatility. Methods like CNNs prove their adaptability by detecting numerous plant diseases in various environments and with different species. This proves that CNNs can discriminate between various plant diseases with high accuracy. AMLCFs enhance this adaptability by making model modification easier and facilitating rapid adaptation to varied datasets and illness types. In addition to helping with disease identification, these frameworks could be useful for other agricultural tasks, including estimating crop yields and preventing pest infestations. Because of the adaptability of modern machine learning algorithms, they may be used for a broad range of agricultural tasks, allowing for comprehensive plant health monitoring, which produces a versatility analysis result of 98.3%. To ensure consistent performance across a wide number of applications and conditions, it is necessary to continually improve and validate the approach, even if it is adaptable. This is the sole method to guarantee that there will be ongoing development. By offering complete solutions to numerous problems related to plant disease control, these strategies may cause an unprecedented shift in agricultural operations. The methods' broad adoption demonstrates their promise and highlights their ability to revolutionize agricultural operations.

In Figure 8, when applying modern machine learning algorithms to the problem of plant disease diagnosis, it is
makes these tactics more accessible to those who may not have a strong background in technology. By automating formerly manual procedures, AMLCFs lessen the demand for specialized skills and facilitate their incorporation into real-world agricultural contexts. This paves the way for their usage to spread, producing a user accessibility ratio of 99.1%. The availability of user-friendly support tools and interfaces is crucial to ensuring farmers and agricultural professionals can effectively utilize new technologies. Although some success has been had, further work is required to make the design and support even more user-friendly. By taking this route, we can make sure that innovative machine-learning methods are applied to farming in an approachable and beneficial way.

In Figure 9, advanced machine learning methods greatly improve the capacity to monitor crop health by allowing for the rapid and precise diagnosis of diseases. Methods like CNNs do an excellent job of analyzing plant images, which can be used to detect disease signs in plants at an early stage. This enables the proactive management of crop health. The method is further optimized with AMLCFs, and improved access to advanced disease diagnostics results from these frameworks' efforts to streamline model deployment and reduce processing requirements. Farmers can get real-time alerts and information about possible problems by incorporating these techniques into crop health monitoring systems, which produce a crop health tracking result of 94.8%. Because of this, farmers can now control and intervene with their crops at the perfect moment. Additionally, this preventative measure encourages better crop treatment decisions and resource allocation and improves the accuracy of illness diagnosis.

inevitably be obstacles that must be overcome. When considering all factors, crop health management benefits substantially from applying advanced machine learning algorithms. Table 1 shows the performance analysis. Despite these advancements, complications like data variability and computational resource demands persist.
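As a rough illustration of the model selection and hyperparameter tuning that AMLCF is described as automating, the following Python sketch searches a small grid of settings and keeps the one with the best validation accuracy. Everything in it is an assumption made for illustration: the k-nearest-neighbour classifier, the two synthetic feature clusters labelled "healthy" and "diseased", the grid values, and all function names are hypothetical and are not the authors' implementation.

```python
import itertools
import math
import random


def make_samples(n_per_class, seed):
    """Synthetic 2-D feature vectors for two hypothetical classes."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("healthy", 0.0), ("diseased", 6.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def distance(a, b, metric):
    """Euclidean or Manhattan distance between two feature vectors."""
    if metric == "euclidean":
        return math.dist(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, point, k, metric):
    """Majority vote among the k training samples closest to `point`."""
    nearest = sorted(train, key=lambda s: distance(s[0], point, metric))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=votes.get)


def accuracy(train, samples, k, metric):
    """Fraction of `samples` whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k, metric) == y for x, y in samples)
    return hits / len(samples)


train_set = make_samples(60, seed=1)
val_set = make_samples(20, seed=2)
test_set = make_samples(20, seed=3)

# Candidate configurations: every (k, metric) pair in a small grid.
grid = list(itertools.product([1, 3, 5, 7], ["euclidean", "manhattan"]))

# Score each configuration on the validation set and keep the best one.
scores = {cfg: accuracy(train_set, val_set, *cfg) for cfg in grid}
best_cfg = max(grid, key=scores.get)

# Report held-out performance of the selected configuration only.
test_accuracy = accuracy(train_set, test_set, *best_cfg)
print("selected:", best_cfg, "test accuracy:", test_accuracy)
```

The test set is touched only once, after the configuration has been chosen on the validation set, so the reported accuracy is not inflated by the search itself.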