Crop Yield Prediction Using ML Algorithms
Authors:-
Pratima Chaudhury
Kalinga Institute of Industrial Technology
3rd Year Information Technology B.Tech
Abstract-
Global climate change has badly affected most agricultural crops in India. This project lets farmers estimate the
yield of their crops before cultivation and thus helps them make the necessary decisions. It uses Random Forest,
a machine learning algorithm. Although factors such as weather, temperature, humidity, and rainfall have been
studied, there are still no adequate solutions to the situation we face. In countries like India, economic growth
is increasing in many forms, including in the agricultural sector. In addition, the approach is useful for
forecasting crop production.
Keywords— Machine Learning; Crop_yield_prediction; Random forest Algorithm;
1. Introduction
India's agricultural history began in the Indus Valley Civilization period. India ranks second in the world in
this sector. Agriculture and allied sectors accounted for 20.2% of GVA (gross value added) in fiscal year
2020-2021, which is 1.8% higher than in fiscal year 2019-2020, and for 18.8% of GVA, with 42.6% of the workforce,
in fiscal year 2021-2022. In terms of net cultivated area, India leads the world with 9.6% of all arable land,
followed by the US (8.9%), China (8.8%), and Russia (8.8%). Demographically, India's socioeconomic fabric is based
mostly on agriculture. The GDP contribution of agriculture in India is declining significantly as
industrialization rises. Integration with technology is not at the desired level, which is a problem for the
Indian agricultural sector and the reason its full potential is not being used. The overuse of industrial
technologies and non-renewable resources makes it difficult for farmers to predict rainfall and temperature,
which affect crop yield. Here, machine learning can help farmers by using algorithms such as RNN and LSTM to
predict trends in temperature, rainfall, and crop yield. Because crops can be planned in advance in accordance
with the prediction, this will ease farmers' lives a little and increase the yield and quality of their harvest.
The practical implementation of machine learning techniques and its quantification are the main topics of this
study. To obtain a consistent trend, the work presented here additionally takes into account the erratic data
from the temperature and rainfall databases. Contrary to the customary practice of predicting crop yields by
considering only one factor at a time, this method takes all of the factors into account.
The remainder of the paper is structured as follows. Section 2 contains the literature survey of research done
before this paper. Section 3 contains the methodology, which briefly describes the different algorithms and the
requirements for ML algorithms. Section 4 contains the proposed model. Section 5 gives brief details on data
sources and datasets. Section 6 contains the prediction result obtained using the formula. Section 7 contains the
results and analysis obtained after processing the data in the Random Forest model. Section 8 contains the pros
and cons of the proposed model. Section 9 contains the conclusion of the paper and future uses of the proposed
model.
2. Literature Survey
On a dataset from the Indian government, experiments by Aruvansh Nigam, Saksham Garg, and Archit Agrawal [1]
showed that the Random Forest machine learning method provides the best yield forecast accuracy.
Balamurugan [2] implemented crop yield prediction using only the random forest classifier. Features such as
rainfall, temperature, and season were taken into account to predict the crop yield.
According to Dr. Y. Jeevan Nagendra Kumar [3], supervised learning allows machine learning algorithms to forecast
an objective or outcome. That study focuses on supervised learning methods for predicting crop yields.
Jig Han et al. [4] used a random forest algorithm to predict global and regional crop yields for potato, maize,
and wheat, drawing on environmental variables such as soil, climate, photoperiod, fertilization data, and water.
Leo Breiman [5] focuses on the random forest algorithm's accuracy, strength, and correlation. The random forest
algorithm generates decision trees from different data samples, obtains a prediction from each subset, and then
selects the best answer for the system by voting.
Mishra [6] has theoretically described various machine learning techniques that can be applied in various
forecasting areas.
Using data mining techniques, Shastry et al. [7] fitted various regression models to forecast crop yield in
India. The crop yields of maize, wheat, and cotton are studied using time series data, soil, and weather
parameters.
The research of Manjula et al. [8] aimed to propose and implement a rule-based system that predicts crop yield
production from past data by applying association rule mining to agricultural data from 2000 to 2012.
Table[2.1] summarizes further surveys on crop yield prediction:
| No. | Authors | Objective | Method | Findings |
|-----|---------|-----------|--------|----------|
| 1 | Saeed Khaki, Lizhi Wang and Sotirios V. Archontoulis | Crop yield prediction | CNN-RNN model | It is used to capture the time dependencies of environmental factors and the genetic improvement of seeds over time without having their genotype information. |
| 2 | Mayank Champaneri, Darpan Chachpara, Chaitanya Chandvidkar, Mansing Rathod | Crop yield prediction | Random forest algorithm | Predicting the crop yield. |
| 3 | Thomas van Klompenburg, Ayalew Kassahun, Cagatay Catal | To find the best performing models for crop yield prediction | Deep learning models and methods | The results show that no specific conclusion can be drawn as to what the best model is, but they clearly show that some machine learning models are used more than the others. |
| 4 | Mayank Champaneri, Chaitanya Chandvidkar, Darpan Chachpara, Mansing Rathod | Crop yield prediction | Random forest algorithm | The study demonstrated the potential use of data mining techniques in predicting crop yield based on climatic input parameters. |
| 5 | Ms. Ranjani J, Ms. V.K.G Kalaiselvi, Ms. A. Sheela, Deepika Sree D, Janaki G | Crop yield prediction | Random forest algorithm | The user-friendly web page built for estimating crop yield can be used by any user, for the crop of their choice, by giving climate data for that location. |
| 6 | S. Vinson Joshua, A. Selwin Mich Priyadharson, Raju Kannadasan, Arfat Ahmad Khan, Worawat Lawanont, Faizan Ahmed Khan, Ateeq Ur Rehman and Muhammad Junaid Ali | Crop yield prediction | General Regression Neural Networks (GRNN), Back Propagation Neural Network (BPNN), Support Vector Machine (SVM) | Statistical models, namely MLR, and machine learning models such as BPNN, SVM, and GRNN are demonstrated over a wide area, considering the Indian state of Tamil Nadu. The GRNN model had a more significant potential to explain 97% of the variance from the input parameters towards the crop yield and offered higher prediction accuracy. |
| 7 | Lontsi Saadio Cedric, Wilfried Yves Hamilton Adoni, Rubby Aworka, Jérémie Thouakesseh Zoueu, Franck Kalala Mutombo, Moez Krichen and Charles Lebon Mberi Kimpolo | Crop yield prediction | K-Nearest Neighbor | The authors proposed decision-support tools for decision-makers and farmers that predict six crop yields in some West African countries throughout the year, namely bananas, yams, cassava, maize, rice, and seed cotton. |
2 Crops Arecanut, Other Kharif pulses, Rice, Banana, Cashewnut, Coconut, Dry ginger, Sugarcane, Sweet potato,
Tapioca, Black pepper, Dry chillies, other oilseeds, Turmeric, Maize, Moong(Green Gram), Urad, Arhar/Tur,
Groundnut, Sunflower, Bajra, Castor seed, Cotton(lint), Horse-gram, Jowar, Korra, Ragi, Tobacco, Gram,
Wheat, Masoor, Sesamum, Linseed, Safflower, Onion, other misc. Pulses, Samai, Small millets, Coriander,
Potato, Other Rabi pulses, Soyabean, Beans & Mutter(Vegetable), Bhindi, Brinjal, Citrus Fruit, Cucumber,
Grapes, Mango, Orange, other Fibers, Other Fresh Fruits, Other Vegetables, Papaya, Pome Fruits, Tomato,
Rapeseed & Mustard, Mesta, Cowpea(Lobia), Lemon, Pomegranate, Sapota, Cabbage, Peas, Niger seed,
Bottle Gourd, Sannhamp, Varagu, Garlic, Ginger, Oilseeds total, Pulses total, Jute, Peas & beans (Pulses),
Blackgram, Paddy, Pineapple, Barley, Khesari, Guar seed, Other Cereals & Millets, Cond-spcs other, Turnip,
Carrot, Redish, Arcanut (Processed), Atcanut (Raw),Cashew Nut Processed, Cashew Nut Raw, Cardamom,
Rubber, Bitter Gourd, Drum Stick, JackFruit, Snake Guard, Pump Kin, Tea, Coffee, Cauliflower, Other Citrus
Fruit, Water Melon, Total foodgrain, Kapas, Colocasia, Lentil, Bean, Jobster, Perilla, Rajmash Kholar,
Ricebean (nagadal), Ash Gourd, Beet Root, Lab-Lab, Ribbed Gourd, Yam, Apple, Peach, Pear, Plums,
Litchi, Ber, Other Dry Fruit, Jute & mesta
5. Proposed Model
The diagram of the proposed model shown above depicts the Random Forest model, which works in several steps:
1. When the algorithm starts, the datasets are loaded into the model and graphs are plotted from them. Random
samples are then taken from the datasets and processed into a form suitable for constructing decision trees.
2. The decision trees are built using an attribute selection process; the selected attributes are the data points
(subset) chosen by the user. Each decision tree receives its data and derives a set of rules and formulas to
predict the result. Each tree uses a different set of data and forms different rules for prediction.
3. The result from each decision tree is collected and voted on by the Random Forest classifier, and the result
that gets the most votes is selected as the final result.
4. The final result is displayed and graphs are plotted from it.
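The random sampling in step 1 can be sketched as follows. This is a minimal illustration that assumes sampling with replacement (bootstrap sampling), as in the standard random forest; the paper does not state the sampling scheme. The record layout, the values in `records`, and the helper name `bootstrap_sample` are hypothetical and are not taken from the paper's dataset.

```python
import random

# Hypothetical records: (crop, rainfall_mm, temperature_c, yield_class).
# The values are made up for illustration only.
records = [
    ("Rice", 1200, 28, "high"),
    ("Rice", 700, 31, "low"),
    ("Wheat", 450, 22, "high"),
    ("Wheat", 300, 27, "low"),
    ("Maize", 800, 25, "high"),
    ("Maize", 500, 33, "low"),
]

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; also return the rows never drawn."""
    picked = [rng.randrange(len(rows)) for _ in rows]
    drawn = set(picked)
    sample = [rows[i] for i in picked]
    out_of_bag = [rows[i] for i in range(len(rows)) if i not in drawn]
    return sample, out_of_bag

rng = random.Random(42)  # fixed seed so the draws are repeatable
for tree_id in range(3):
    sample, out_of_bag = bootstrap_sample(records, rng)
    print(f"tree {tree_id}: {len(sample)} sampled rows, {len(out_of_bag)} left out")
```

Because every tree draws its own sample, the trees see different subsets of the data, which is what lets them form different rules. The rows left out of a sample could serve as a small validation set for that tree.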
Pseudocode of the Proposed System in Fig[5.2]:
1. Randomly select k features out of the total m features in the model.
2. Among the k features, calculate node d using the best split point.
3. Using the best split, split the node into daughter nodes.
4. Repeat steps 1 to 3 until the desired number of nodes has been reached.
5. To build a forest of n trees, repeat steps 1 to 4 n times.
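The five steps above can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: it assumes Gini impurity as the split criterion (the paper does not name one), uses a hypothetical `max_depth` limit as the stopping rule of step 4, and trains every tree on the full toy dataset so that only the random feature subsets differ. All function names and the toy rows are made up for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, k, rng, depth=0, max_depth=5):
    """Steps 1 to 4: pick k random features, split on the best one, recurse."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority                                    # leaf node
    feature_ids = rng.sample(range(len(rows[0])), k)       # step 1
    split = best_split(rows, labels, feature_ids)          # step 2
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    # Step 3: the two daughter nodes are built recursively (step 4).
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], k, rng, depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], k, rng, depth + 1, max_depth))

def build_forest(rows, labels, n_trees, k, seed=0):
    """Step 5: repeat the tree construction n_trees times."""
    rng = random.Random(seed)
    return [build_tree(rows, labels, k, rng) for _ in range(n_trees)]

def predict_tree(tree, row):
    """Follow the splits until a leaf (a class label string) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Toy rows: (rainfall_mm, temperature_c) with a made-up yield class.
rows = [(50, 30), (60, 25), (70, 35), (120, 28), (150, 31), (200, 26)]
labels = ["low", "low", "low", "high", "high", "high"]
forest = build_forest(rows, labels, n_trees=5, k=2)
print([predict_tree(tree, (130, 29)) for tree in forest])
```

A tree is represented as either a class label (leaf) or a tuple `(feature, threshold, left, right)`. Choosing k smaller than the number of features makes the trees differ from one another, which is the point of step 1.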
To perform prediction, the trained random forest algorithm uses the pseudocode below, as shown in Fig[4.2]:
1. Take the test features and use each random decision tree to predict the outcome, then save each predicted
outcome.
2. Calculate the votes that each predicted outcome receives from the decision trees.
3. Take the predicted outcome with the most votes as the random forest's final prediction.
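The voting procedure above can be sketched with a few lines of Python. This is a minimal illustration in which each trained tree is represented by a plain function; the three stand-in trees, their thresholds, and the name `forest_predict` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def forest_predict(trees, features):
    """Collect one prediction per tree and return the most voted outcome."""
    votes = Counter(tree(features) for tree in trees)  # steps 1 and 2
    winner, _ = votes.most_common(1)[0]                # step 3
    return winner, dict(votes)

# Hypothetical stand-ins for trained decision trees; each maps features to a class.
trees = [
    lambda f: "high" if f["rainfall"] > 100 else "low",
    lambda f: "high" if f["temperature"] < 30 else "low",
    lambda f: "high" if f["rainfall"] > 150 else "low",
]

winner, votes = forest_predict(trees, {"rainfall": 120, "temperature": 27})
print(winner, votes)
```

With an even number of trees a tie is possible; in this sketch `Counter.most_common` then returns the outcome that was counted first, so an odd number of trees is the simpler choice.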
10. References
[1] Aruvansh Nigam, Saksham Garg, and Archit Agrawal, "Predict the best yield forecast accuracy".
[2] Balamurugan, "Random Forests", 2001.
[3] Dr. Y. Jeevan Nagendra Kumar, "Predicting Yield of the Crop Using Machine Learning Algorithm", 2015.
[4] Jig Han et al., "Applications of machine learning techniques in agricultural crop production", 2016.
[5] Leo Breiman, "Random forest".
[6] Mishra, "Random forest".
[7] Shastry et al., "Crop yield prediction".
[8] Manjula et al., "Crop yield prediction".
[9] Saeed Khaki, Lizhi Wang and Sotirios V. Archontoulis, "Crop yield prediction".
[10] Mayank Champaneri, Darpan Chachpara, Chaitanya Chandvidkar, Mansing Rathod, "Crop yield prediction".
[11] Mayank Champaneri, Darpan Chachpara, Chaitanya Chandvidkar, Mansing Rathod, "Crop yield prediction".
[12] Thomas van Klompenburg, Ayalew Kassahun, Cagatay Catal, "Crop yield prediction".
[13] Mayank Champaneri, Chaitanya Chandvidkar, Darpan Chachpara, Mansing Rathod, "Crop yield prediction".
[14] Ms. Ranjani J, Ms. V.K.G Kalaiselvi, Ms. A. Sheela, Deepika Sree D, Janaki G, "Crop yield prediction".
[15] S. Vinson Joshua, A. Selwin Mich Priyadharson, Raju Kannadasan, Arfat Ahmad Khan, Worawat Lawanont, Faizan, "Crop yield prediction".
[16] Lontsi Saadio Cedric, Wilfried Yves Hamilton Adoni, Rubby Aworka, Jérémie Thouakesseh Zoueu, "Crop yield prediction".