Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: F8640088619/2019©BEIESP
DOI: 10.35940/ijeat.F8640.088619
Evaluation of Machine Learning Algorithms for Crop Yield Prediction
Then, the experimental results on agricultural data are discussed in Section IV. Finally, the conclusion is given in Section V.

II. RELATED WORK

Machine learning in agriculture is a novel field, and a great deal of work has been done in the field of agriculture using machine learning. Various forecasting methodologies have been developed and evaluated by researchers all over the world in the field of agriculture or related sciences. Agricultural researchers in Pakistan have shown that attempts at crop yield maximization through pro-pesticide state policies have led to dangerously high pesticide use. These studies have reported a negative correlation between pesticide use and crop yield [5]. In their study, they explain how data mining of integrated agricultural data, including pest scouting, pesticide usage and meteorological data, is useful for the optimization of pesticide use. Thematic information related to agriculture that has spatial attributes was reported in one study [6]. That research aimed at recognizing trends in agricultural production with reference to the availability of input resources. The k-means method was applied to forecast pollution in the atmosphere [7], the k-nearest neighbor method was applied to simulate daily precipitation and other weather variables [8], and various possible changes in weather scenarios were analyzed using SVM [9]. Data mining techniques are often used to study soil characteristics. For example, the k-means method is used for segmenting soils in combination with GPS-based technology [10]. A decision tree classifier for agricultural data was proposed in [11]. This new classifier uses a new data expression and can handle both complete and incomplete data. In the experiment, a 10-fold cross-validation technique is used to test the datasets, namely the horse-colic dataset and the soybean dataset. Their results showed that the proposed decision tree is capable of classifying all types of agricultural data. A yield prediction model was proposed in one study [12], which uses data mining techniques for classification and prediction. This model works on the input parameters crop name, land area, soil type, soil pH, pest details, weather, water level and seed type. It predicts plant growth and plant diseases and thus enables selection of the best crop based on weather information and the required parameters.

There are a few research works on sugarcane yield prediction that are related to our work. A sugarcane yield prediction technique using random forest [13] was proposed in one study; the features used in that study include biomass index, climate data (e.g., rainfall) and yields from previous years. Two predictive tasks are presented in [13]: (i) the classification problem of predicting whether the yield will be above or below the observed median yield, and (ii) the regression problem of predicting yield estimates in two different time periods. In addition, a support vector machine [14] was proposed for rice crop yield prediction; the dataset used in this method includes precipitation, minimum, maximum and average temperature, area, evapotranspiration and production. The sequential minimal optimization classifier is applied to the dataset. The dataset is processed with the WEKA tool to build the set of rules on the current dataset. The results were generated in Python using the SVM algorithm. In [15], a decision tree and decision rules were developed based on the C4.5 algorithm. In that study, the authors developed a website called Crop Advisor, an interactive website for finding the effect of weather on crop production using the C4.5 algorithm [15]. This gives an idea of how different climatic parameters affect the growth of the crop. The selections were made based on the area under the chosen crop. Information on the climatic parameters for the corresponding years, such as rainfall, maximum and minimum temperature, and wet day frequency, was collected. In [16], the ID3 algorithm was used to obtain good-quality and improved tomato crop yield; the system is implemented on the PHP platform and uses CSV files as data sets. The features used in that study include area, production of tomato crop, temperature and humidity.

III. PROPOSED SYSTEM

In the proposed system, we use supervised learning to build a model that provides the predicted value of crop yield and the corresponding production order. The proposed system is described in the following stages: dataset collection, preprocessing, feature selection and application of machine learning modules, as shown in figure 1.

A. Dataset Collection: Data is collected from a variety of sources and prepared as data sets. This data is used for descriptive analysis. Data is available from several online sources such as Kaggle.com and data.gov.in. We use an annual summary of crops covering at least 10 years. The data sets used in this paper are the soil dataset, the rainfall dataset and the crop yield data.

B. Preprocessing step: This is a very important step in machine learning. Preprocessing consists of filling in missing values, bringing the data into an appropriate range, and extracting features. The type of the dataset is critical to the analysis process. In this paper we have used the isnull() method for checking null values and LabelEncoder() for converting the categorical data into numerical data.

C. Feature Selection: Feature extraction should reduce the amount of data needed to represent a large data set. The soil and crop characteristics extracted in the preprocessing phase constitute the final training set. These characteristics include the physical and chemical properties of the soil. Here, we have used the RandomForestClassifier() method for feature selection. This method selects the features based on the entropy value, i.e., the attribute with the higher entropy value is selected as an important feature for yield prediction.
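The preprocessing step of Section III.B can be illustrated with a minimal sketch. To stay dependency-free, it mirrors in plain Python what the pandas isnull() check and the scikit-learn LabelEncoder() do, rather than calling those libraries. The column names, the sample values and the choice of mean imputation for missing values are assumptions made for illustration only; the paper does not state how missing values are filled.

```python
import math

def is_missing(value):
    """True for None or NaN, the kinds of entries a null check flags."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def fill_with_mean(column):
    """Replace missing entries with the mean of the present ones (assumed strategy)."""
    present = [v for v in column if not is_missing(v)]
    mean = sum(present) / len(present)
    return [mean if is_missing(v) else v for v in column]

def label_encode(column):
    """Map each category to an integer code, with classes taken in sorted order."""
    classes = sorted(set(column))
    mapping = {c: i for i, c in enumerate(classes)}
    return [mapping[v] for v in column], mapping

# Hypothetical sample columns, invented for this sketch
rainfall = [820.0, None, 640.0, float("nan")]
soil_type = ["clay", "sandy", "loam", "clay"]

missing_mask = [is_missing(v) for v in rainfall]    # which entries are null
rainfall_filled = fill_with_mean(rainfall)          # nulls replaced by the column mean
soil_codes, soil_mapping = label_encode(soil_type)  # categories turned into integers

print(missing_mask)
print(rainfall_filled)
print(soil_codes, soil_mapping)
```

After this step every column is numeric and free of null values, which is the form the later feature selection and model building stages expect.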
International Journal of Engineering and Advanced Technology (IJEAT)
ISSN: 2249 – 8958, Volume-8 Issue-6, August 2019
Flow chart:
iii. Decision tree
A decision tree is a predictive model that works by checking a condition at every level of the tree and proceeding towards the bottom of the tree, where the various decisions are listed. The condition depends on the application, and the outcome is expressed as a decision. There are various types of decision tree algorithms, such as C4.5, CART and ID3. In this paper we use the ID3 algorithm for model building. Factors such as rainfall, soil type and crop yield can be used as input to the prediction algorithm, and the result is a prediction of productivity.
The ID3 algorithm [15] starts with the original set S as the root node. On each iteration, the algorithm goes through every unused attribute of the set S and computes the entropy H(S) (or information gain IG(S)) of that attribute. It then selects the attribute with the smallest entropy value (or the largest information gain). The set S is then split by the selected attribute into subsets. The algorithm continues to recurse on each subset, considering only those attributes that were not previously selected. Figure 4 shows the flow diagram of the ID3 algorithm.

The different parameters set for these algorithms were mean squared error, accuracy and cross validation, which are used to estimate the efficiency of each method, as shown in Table 1.

The formulas for calculating accuracy and mean squared error (MSE) are:

Table 1. Comparison table for different parameters

Algorithms       Accuracy (%)   MSE
Decision tree    99             0.01
KNN              58.078         0.603049
SVM_rbf          58.952         0.559123
SVM_linear       53.712         6.209897
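As a minimal sketch of the ideas in this section, the example below picks the root attribute the way ID3 does, by largest information gain, turns that single split into a one-level tree, and scores its predictions with accuracy and MSE. It uses the standard definitions of both measures (accuracy as the percentage of correct predictions, MSE as the mean of the squared differences between actual and predicted values), which is an assumption about the exact form the authors used. The toy rows, the 0/1 yield labels and all function names are invented for illustration, and only one ID3 iteration is shown, not the full recursive algorithm.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def split_labels(rows, labels, attr):
    """Group the labels by the value each row has for attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups

def information_gain(rows, labels, attr):
    """IG = H(S) minus the size-weighted entropy of the subsets split on attr."""
    groups = split_labels(rows, labels, attr)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """One ID3 iteration: choose the unused attribute with the largest gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

def accuracy(y_true, y_pred):
    """Percentage of predictions that match the actual labels."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical training data: 1 = high yield, 0 = low yield
train_rows = [
    {"rainfall": "high", "soil": "clay"},
    {"rainfall": "high", "soil": "sandy"},
    {"rainfall": "low", "soil": "clay"},
    {"rainfall": "low", "soil": "sandy"},
]
train_labels = [1, 1, 0, 0]

root = best_attribute(train_rows, train_labels, ["rainfall", "soil"])

# One-level tree: predict the majority label seen for each value of the root attribute
groups = split_labels(train_rows, train_labels, root)
stump = {value: Counter(g).most_common(1)[0][0] for value, g in groups.items()}

# Hypothetical held-out data
test_rows = [
    {"rainfall": "high", "soil": "clay"},
    {"rainfall": "low", "soil": "clay"},
    {"rainfall": "high", "soil": "sandy"},
    {"rainfall": "low", "soil": "sandy"},
]
test_labels = [1, 0, 0, 0]
preds = [stump[row[root]] for row in test_rows]

print(root, accuracy(test_labels, preds), mean_squared_error(test_labels, preds))
```

In this toy data, splitting on rainfall separates the two classes completely while splitting on soil does not, so rainfall has the larger information gain and becomes the root. A full ID3 implementation would repeat the same selection on each subset with the remaining attributes.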
REFERENCES
1. Arun Kumar, Naveen Kumar, Vishal Vats. "Efficient crop yield prediction using machine learning algorithms", IJRET, Volume: 05, Issue: 06, June-2018, pp. 3151-3159.
2. P. Priya, U. Muthaiah, M. Balamurugan. "Predicting yield of the crop using machine learning Algorithm", IJESRT, 7(4), April-2018, pp. 2277-2284.