Literature Review
Machine learning applications are having a great impact on the global economy by transforming
the data processing method and decision making. Agriculture is one of the fields where the
impact is significant, considering the global crisis for food supply. This research investigates the
potential benefits of integrating machine learning algorithms in modern agriculture. The main
focus of these algorithms is to help optimize crop production and reduce waste through informed
decisions regarding planting, watering, and harvesting crops. This paper includes a discussion on
the current state of machine learning in agriculture, highlighting key challenges and
opportunities, and presents experimental results that demonstrate the impact of changing labels
on the accuracy of data analysis algorithms. The findings suggest that by analyzing wide-ranging
data collected from farms, including IoT sensor data obtained online in real time, farmers can
make more informed decisions about factors that affect crop growth. Ultimately, integrating
these technologies can transform modern agriculture by
increasing crop yields while minimizing waste. Fifteen different algorithms have been
considered to evaluate the most appropriate algorithms to use in agriculture, and a new feature
combination scheme-enhanced algorithm is presented. The results show that we can achieve a
classification accuracy of 99.59% using the Bayes Net algorithm and 99.46% using Naïve Bayes
Classifier and Hoeffding Tree algorithms. These results point to increased production rates
and reduced effective costs for farms, leading to more resilient infrastructure and
sustainable environments. Moreover, the findings we obtained in this study can also help future
farmers detect diseases early, increase crop production efficiency, and reduce prices when the
world is experiencing food shortages.
2. Using machine learning for crop yield prediction in the past or the future, Alejandro
Morales, Francisco J. Villalobos - 2023.
The use of ML in agronomy has been increasing exponentially since the start of the century,
including data-driven predictions of crop yields from farm-level information on soil, climate and
management. However, little is known about the effect of data partitioning schemes on the actual
performance of the models, in particular when they are built for yield forecasting. In this study, we
explore the effect of the choice of predictive algorithm, amount of data, and data partitioning
strategies on predictive performance, using synthetic datasets from biophysical crop models. We
simulated sunflower and wheat data using OilcropSun and Ceres-Wheat from DSSAT for the
period 2001-2020 in 5 areas of Spain. Simulations were performed in farms differing in soil
depth and management. The data set of farm simulated yields was analyzed with different
algorithms (regularized linear models, random forest, artificial neural networks) as a function of
seasonal weather, management, and soil. The analysis was performed with Keras for neural
networks and R packages for all other algorithms. Data partitioning for training and testing was
performed with ordered data (i.e., older data for training, newest data for testing) in order to
compare the different algorithms in their ability to predict yields in the future by extrapolating
from past data. The Random Forest algorithm had a better performance (Root Mean Square Error
35-38%) than artificial neural networks (37-141%) and regularized linear models (64-65%) and
was easier to execute. However, even the best models showed a limited advantage over the
predictions of a sensible baseline (average yield of the farm in the training set) which showed
RMSE of 42%. Errors in seasonal weather forecasting were not taken into account, so real-world
performance is expected to be even closer to the baseline. Application of AI algorithms for yield
prediction should always include a comparison with the best guess to evaluate if the additional
cost of data required for the model compensates for the increase in predictive power. Random
partitioning of data for training and validation should be avoided in models for yield forecasting.
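The ordered partitioning and baseline comparison described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: the record layout, the function names, and the toy yields are hypothetical, and relative RMSE is assumed here to mean RMSE as a percentage of the mean observed yield.

```python
from math import sqrt
from statistics import mean


def chronological_split(records, first_test_year):
    """Older seasons go to training, newer seasons to testing (no shuffling)."""
    train = [r for r in records if r["year"] < first_test_year]
    test = [r for r in records if r["year"] >= first_test_year]
    return train, test


def farm_mean_baseline(train):
    """Baseline model: each farm's average yield over the training seasons."""
    by_farm = {}
    for r in train:
        by_farm.setdefault(r["farm"], []).append(r["yield"])
    return {farm: mean(values) for farm, values in by_farm.items()}


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed yield (assumed definition)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * sqrt(mean(squared_errors)) / mean(observed)


# Hypothetical yields (t/ha) for two farms over four seasons.
records = [
    {"farm": "A", "year": 2017, "yield": 2.0},
    {"farm": "A", "year": 2018, "yield": 4.0},
    {"farm": "A", "year": 2019, "yield": 3.0},
    {"farm": "A", "year": 2020, "yield": 5.0},
    {"farm": "B", "year": 2017, "yield": 1.0},
    {"farm": "B", "year": 2018, "yield": 3.0},
    {"farm": "B", "year": 2019, "yield": 2.0},
    {"farm": "B", "year": 2020, "yield": 2.0},
]

train, test = chronological_split(records, first_test_year=2019)
baseline = farm_mean_baseline(train)
observed = [r["yield"] for r in test]
predicted = [baseline[r["farm"]] for r in test]
baseline_error = relative_rmse(observed, predicted)
print(f"Baseline relative RMSE: {baseline_error:.1f}%")
```

Any candidate model would be scored on the same test seasons with the same metric, so that its error can be compared directly with the baseline error.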
3. Crop prediction using machine learning, Madhuri Shripathi Rao, Arushi Singh, N.V.
Subba Reddy and Dinesh U Acharya - 2021.
For most developing countries, agriculture is the primary source of revenue. Modern
agriculture is a constantly evolving field of agricultural advances and farming techniques. It
is becoming challenging for farmers to satisfy the planet's evolving requirements and the
expectations of merchants, customers, and others. Some of the challenges farmers face are: (i)
dealing with climatic changes caused by soil erosion and industrial emissions; (ii) nutrient
deficiency in the soil, caused by a shortage of crucial minerals such as potassium, nitrogen, and
phosphorus, which can result in reduced crop growth; and (iii) cultivating the same crops year
after year without experimenting with different varieties. Farmers also add fertilizers
arbitrarily, without knowing the appropriate quality or quantity. The paper aims to discover the
best model for crop prediction, which can help farmers decide the type of crop to grow based on
the climatic conditions and nutrients present in the soil. This paper compares popular algorithms
such as K-Nearest Neighbor (KNN), Decision Tree, and Random Forest Classifier, using two
different criteria, Gini and Entropy. Results reveal that Random Forest gives the highest
accuracy among the three.
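The two split criteria named in this abstract, Gini and Entropy, are the standard impurity measures used when growing decision trees and random forests. The sketch below shows how each scores a candidate split. The crop labels and function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def weighted_impurity(groups, impurity):
    """Size-weighted average impurity of the child nodes after a split."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups)


# Hypothetical crop labels at a tree node, and one candidate split of that node.
parent = ["rice", "rice", "maize", "maize"]
left, right = ["rice", "rice"], ["maize", "maize"]

gini_gain = gini(parent) - weighted_impurity([left, right], gini)
entropy_gain = entropy(parent) - weighted_impurity([left, right], entropy)
print(f"Gini gain: {gini_gain:.2f}, entropy gain: {entropy_gain:.2f}")
```

A tree builder evaluates many candidate splits this way and keeps the one with the largest impurity reduction. The two criteria often select similar splits, which is why comparing them empirically, as the paper does, is reasonable.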
Smart imaging devices have been adopted at a rapid rate in the agriculture sector over the last few
years. Fruit recognition and classification is regarded as one of the emerging areas in computer
vision and image classification. A fruit classification system may be adopted in the fruit market
for consumers to determine the variety and grade of fruits. Fruit quality is an essential property
from a health viewpoint. Classification systems described so far are not adequate for fruit
recognition and classification in terms of accuracy and quantitative analysis. Deep learning models
have the ability to extract the relevant image features without using handcrafted features. In this
paper, Type-II Fuzzy, TLBO (Teacher-learner based optimization), and deep learning
Convolution Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term
Memory (LSTM) applications are proposed to enhance, segment, recognize, and classify fruit
images. Automatic fruit recognition and classification remains a demanding task, so the
examination of new proposals is worthwhile. Deep learning is a powerful state-of-the-art approach
for image classification. The work uses the deep learning models CNN, RNN, and LSTM to
classify fruits based on chosen optimal and derived features. Preliminary results indicate that
the recommended procedure performs well in terms of accuracy and quantitative analysis. Moreover,
the comparatively high computational efficiency of the proposed scheme is expected to favor its
future adoption.
Computer vision and image processing techniques are considered efficient tools for classifying
various types of fruits and vegetables. In this paper, automated fruit classification and detection
systems have been developed using deep learning algorithms. In this work, we used two datasets
of colored fruit images. The first, the FIDS-30 dataset of 971 images with 30 distinct classes of
fruits, is publicly available. A major contribution of this work is to present a private dataset
containing 761 images with eight categories of fruits, which have been collected and annotated by
ourselves. In our work, the YOLOv3 deep learning object detection algorithm has been used for
individual fruit detection across multiple classes, and ResNet50 and VGG16 techniques have
been utilized for the final classification for the recognition of a single category of fruit in images.
Next, we implemented the automatic fruit classification models with Flask as the web
framework. We obtained 86% and 85% accuracies on the public dataset with ResNet50 and
VGG16, respectively. We achieved 99% accuracy with ResNet50 and 98% accuracy with the
VGG16 model on the custom dataset. The domain adaptation approach is used in this work so
that the proposed deep learning-based prediction model can cope with real-world problems of
diverse domains. Finally, an Android smartphone application has been developed to classify and
detect fruits with the camera in real-time. All the images uploaded from the Android device are
automatically sent and consequently analyzed on the web, and finally, the processed data and
results are returned to the smartphone. The custom dataset and implementation codes will be
available after the manuscript has been accepted.
6. Coupling machine learning and crop modeling improves crop yield prediction in the US
Corn Belt, Mohsen Shahhosseini, Guiping Hu, Isaiah Huber & Sotirios V. Archontoulis -
2021.
This study investigates whether coupling crop modeling and machine learning (ML) improves
corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid
approach (crop modeling + ML) would result in better predictions, investigate which
combinations of hybrid models provide the most accurate predictions, and determine the features
from the crop modeling that are most effective to be integrated with ML for corn yield
prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and
XGBoost) and six ensemble models have been designed to address the research question. The
results suggest that adding simulation crop model variables (APSIM) as input features to ML
models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%.
Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models
and found that soil-moisture-related APSIM variables are most influential on the ML predictions,
followed by crop-related and phenology-related variables. Finally, based on feature importance
measures, it has been observed that simulated APSIM average drought stress and average water
table depth during the growing season are the most important APSIM inputs to ML. This result
indicates that weather information alone is not sufficient and ML models need more hydrological
inputs to make improved yield predictions.
7. Crop yield prediction using machine learning: A systematic literature review, Thomas
van Klompenburg, Ayalew Kassahun, Cagatay Catal-2021.
Machine learning is an important decision support tool for crop yield prediction, including
supporting decisions on what crops to grow and what to do during the growing season of the
crops. Several machine learning algorithms have been applied to support crop yield prediction
research. In this study, we performed a Systematic Literature Review (SLR) to extract and
synthesize the algorithms and features that have been used in crop yield prediction studies. Based
on our search criteria, we retrieved 567 relevant studies from six electronic databases, of which
we have selected 50 studies for further analysis using inclusion and exclusion criteria. We
investigated these selected studies carefully, analyzed the methods and features used, and
provided suggestions for further research. According to our analysis, the most used features are
temperature, rainfall, and soil type, and the most applied algorithm is Artificial Neural Networks
in these models. After this observation, based on the analysis of 50 machine learning-based
papers, we performed an additional search in electronic databases to identify deep learning-based
studies, found 30 deep learning-based papers, and extracted the applied deep learning
algorithms. According to this additional analysis, Convolutional Neural Networks (CNN) is the
most widely used deep learning algorithm in these studies, and the other widely used deep
learning algorithms are Long-Short Term Memory (LSTM) and Deep Neural Networks (DNN).
8. An interaction regression model for crop yield prediction, Javad Ansarifar, Lizhi Wang
& Sotirios V. Archontoulis-2021.
Crop yield prediction is crucial for global food security yet notoriously challenging due to
multitudinous factors that jointly determine the yield, including genotype, environment,
management, and their complex interactions. Integrating the power of optimization, machine
learning, and agronomic insight, we present a new predictive model (referred to as the interaction
regression model) for crop yield prediction, which has three salient properties. First, it achieved a
relative root mean square error of 8% or less in three Midwest states (Illinois, Indiana, and Iowa)
in the US for both corn and soybean yield prediction, outperforming state-of-the-art machine
learning algorithms. Second, it identified about a dozen environment by management
interactions for corn and soybean yield, some of which are consistent with conventional
agronomic knowledge, whereas other interactions require additional analysis or
experiment to prove or disprove. Third, it quantitatively dissected crop yield into contributions
from weather, soil, management, and their interactions, allowing agronomists to pinpoint the
factors that favorably or unfavorably affect the yield of a given location under a given weather
and management scenario. The most significant contribution of the new prediction model is its
capability to produce accurate prediction and explainable insights simultaneously. This was
achieved by training the algorithm to select features and interactions that are spatially and
temporally robust to balance prediction accuracy for the training data and generalizability to the
test data.
9. Predicting Agriculture Yields Based on Machine Learning Using Regression and Deep
Learning, Priyanka Sharma; Pankaj Dadheech; Nagender Aneja; Sandhya Aneja-2023.
Agriculture contributes a significant amount to the economy of India because human beings
depend on it for their survival. The main obstacle to food security is population expansion
leading to rising demand for food. Farmers must produce more on the same land to boost the
supply. Through crop yield prediction, technology can assist farmers in producing more. This
paper’s primary goal is to predict crop yield using the variables of rainfall, crop,
meteorological conditions, area, production, and yield that have posed a serious threat to the
long-term viability of agriculture. Crop yield prediction is a decision-support tool that uses
machine learning and deep learning to help decide which crops to produce and what to do during
the crop’s growing season. Machine learning and deep learning algorithms are used in crop
selection to reduce yield losses, regardless of adverse environmental conditions. To estimate the
agricultural yield, the machine learning techniques decision tree, random forest, and XGBoost
regression and the deep learning techniques convolutional neural network and long short-term
memory network have been used. Accuracy, root mean square error, mean square error, mean
absolute error, standard deviation, and losses are compared. The random forest and the
convolutional neural network outperform the other machine learning and deep learning
methods. The random forest has a maximum accuracy of 98.96%, mean absolute error of
1.97, root mean square error of 2.45, and standard deviation of 1.23. The convolutional neural
network has been evaluated with a minimum loss of 0.00060. Consequently, a model is
developed that, compared to other algorithms, predicts the yield quite well. The findings are then
analyzed using the root mean square error metric to understand better how the model’s errors
compare to those of the other methods.
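The error metrics compared in this abstract can be computed as in the sketch below. It is illustrative only: the abstract does not say which quantity the standard deviation refers to, so it is assumed here to be the standard deviation of the prediction errors, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE and the standard deviation of the prediction errors."""
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = mean([e ** 2 for e in errors])
    return {
        "mae": mean([abs(e) for e in errors]),
        "mse": mse,
        "rmse": sqrt(mse),
        # Assumption: "standard deviation" is taken over the errors themselves.
        "error_std": pstdev(errors),
    }


# Hypothetical observed and predicted yields.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of how a model's errors are distributed.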
10. Crop Yield Prediction based on Indian Agriculture using Machine Learning, Potnuru
Sai Nishant; Pinapa Sai Venkat; Bollu Lakshmi Avinash; B. Jabber-2020.
Agriculture is the backbone of India. This paper predicts the yield of almost all kinds of
crops that are planted in India. Its novelty lies in the use of simple parameters such as
state, district, season, and area, with which the user can predict the yield of the crop for
whichever year he or she wants. The paper uses advanced regression techniques such as
Kernel Ridge, Lasso, and ENet algorithms to predict the yield and uses the concept of Stacking
Regression to enhance the algorithms and give a better prediction.
PROBLEM STATEMENT:
Agriculture is the pillar of the Indian economy, and more than 50% of India's population depends
on agriculture for its survival. Variations in weather, climate, and other environmental
conditions have become a major risk to the long-term viability of agriculture. The main
obstacle to food security is population expansion, which leads to rising demand for food.
Farmers must produce more on the same land to boost the supply. Through crop yield prediction,
technology can assist farmers in producing more. Machine learning (ML) plays a significant role
because it offers a decision-support tool for Crop Yield Prediction (CYP), supporting decisions
on what crops to grow and what to do during the growing season. It is also used in crop
selection to reduce yield losses, regardless of adverse environmental conditions. The project
builds on a systematic review that extracts and synthesizes the features used for CYP; in
addition, a variety of methods have been developed to analyze crop yield prediction using
artificial intelligence techniques. The major limitations noted for neural networks relate to
the relative error and decreased prediction efficiency for crop yield.
Crop yield estimation's major purpose is to boost agricultural crop production, and it does so
using a variety of well-established models. Machine learning is increasingly used around
the world due to its success in a range of disciplines such as forecasting, fault detection, and
pattern identification. Yield prediction is a key agricultural concern. Using the results of this
study, farmers will be able to estimate the yield of their crop before planting, allowing them
to make informed decisions.
Existing System:
Crop yield prediction is one of the challenging problems in precision agriculture, and
many models have been proposed and validated so far.
This problem requires the use of several datasets, since crop yield depends on many
different factors such as climate, weather, soil, use of fertilizer, and seed variety. This indicates
that crop yield prediction is not a trivial task; instead, it consists of several complicated steps.
Nowadays, crop yield prediction models can estimate the actual yield reasonably well, but
better performance in yield prediction is still desirable.
To make future predictions, regression approaches are used, while descriptive models are
used to gain insight from the collected data and achieve high accuracy.
To keep the system simple and directly usable by farmers, this project
uses simple inputs: the farmer's state and district, the
crop, the crop year, and the season.
The system predicts the yield of almost all kinds of crops that are
planted in India. Its novelty lies in the use of simple parameters
such as state, district, season, and area, with which the user can predict the yield of the
crop for whichever year he or she wants.
The user has to enter the district, season, and year. After submitting the
inputs, it will output the best crop to be planted.
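One way the final recommendation step could work is sketched below, assuming a trained yield model is available: predict the yield of every candidate crop for the entered district, season, and year, and return the crop with the highest prediction. The function names, district name, and yield values are hypothetical, and a fixed lookup table stands in for the trained regression model.

```python
def recommend_crop(district, season, year, candidate_crops, predict_yield):
    """Return the crop with the highest predicted yield, plus all predictions."""
    predictions = {
        crop: predict_yield(district, season, year, crop) for crop in candidate_crops
    }
    best = max(predictions, key=predictions.get)
    return best, predictions


# Hypothetical stand-in for a trained regression model: a fixed lookup table.
TOY_YIELDS = {
    ("District A", "Kharif", "Rice"): 3.1,
    ("District A", "Kharif", "Maize"): 4.2,
    ("District A", "Kharif", "Cotton"): 1.8,
}


def toy_predict_yield(district, season, year, crop):
    # The year is ignored here; a real model would use it as an input feature.
    return TOY_YIELDS.get((district, season, crop), 0.0)


best_crop, all_predictions = recommend_crop(
    "District A", "Kharif", 2024, ["Rice", "Maize", "Cotton"], toy_predict_yield
)
print(best_crop, all_predictions)
```

Returning the full set of predictions alongside the best crop lets the interface show the farmer how close the alternatives are.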