Paper 7
Abstract - Agriculture plays a predominant role in the economic growth and development of the country. The major setback in crop productivity is that farmers do not choose the right crop for cultivation. In order to improve crop productivity, a crop recommendation system is to be developed that uses the ensembling technique of machine learning. The ensembling technique is used to build a model that combines the predictions of multiple machine learning models to recommend the right crop, with high accuracy, based on the soil-specific type and characteristics. The independent base learners used in the ensemble model are Random Forest, Naive Bayes, and Linear SVM. Each classifier provides its own set of class labels with an acceptable accuracy. The class labels of the individual base learners are combined using the majority voting technique. The crop recommendation system classifies the input soil dataset into the recommended crop type, Kharif or Rabi. The dataset comprises the soil-specific physical and chemical characteristics in addition to climatic conditions such as average rainfall and surface temperature samples. The average classification accuracy obtained by combining the independent base learners is 99.91%.

Keywords - Ensemble, Majority-Voting, Naive-Bayes, soil, crop-recommendation

I. OBJECTIVES

• To design a recommendation system for accurate crop selection based on the various soil, rainfall and surface temperature parameters.
• To improve crop productivity by providing predictions of high accuracy and efficiency through the ensembling technique.
• To reduce wrong crop choices by applying the principles of precision agriculture.

II. INTRODUCTION

India is one of the nations that has agriculture as its primary source of income. Agriculture contributes only around 14% to the GDP but has a considerable impact on the Indian economy. Conventional agricultural practices and techniques pose many issues in terms of efficiency, cost-effectiveness and resource utilization. There is a need for better techniques that can also improve the standard of living of the farmers. Over the years, owing to globalization, agriculture has evolved by adopting the latest technologies and techniques for a better standard of living. Among these, precision agriculture is an emerging technology in the field of agriculture. Precision agriculture mainly focuses on site-specific farming [1]. Crop recommendation is a primary area of precision agriculture. It relies on multiple parameters, and precision agriculture practices help in identifying these parameters, thereby facilitating better crop selection.

The agricultural domain has adopted machine-learning algorithms to produce efficient, cost-effective solutions to the difficulties faced by farmers. Researchers can use computer simulations to run early tests that assess how a crop variety may perform when faced with different sub-climates, soil types, weather patterns, and other variables. Researchers in present-day agriculture are testing their hypotheses at a larger scale and making considerably more precise, real-time predictions.

III. LITERATURE SURVEY

The paper [2] mainly sheds light on the implementation of a crop prediction system based on sensor networks, developed using IoT. Soil testing labs take a considerable amount of time to provide the results of submitted soil samples. Hence the system claims that it helps
the farmers to get a better crop prediction without any delay in the waiting period [2]. In the paper, the authors have mainly focused on analyzing the N (Nitrogen), P (Phosphorus) and K (Potassium) contents of the soil sample collected for the survey. The proposed method efficiently estimates the soil nutrients based on the data fetched by the sensor network. This enables prediction of the apt crop for the soil under test. The farmers need to register their NPK sensor with the main server. The NPK sensor extracts the nutrient level from the soil sample and updates this information on the main server through the Raspberry Pi unit. Based on the readings received, the algorithm makes predictions from the recorded information. The major shortfalls of this implementation are the inefficiency of the crop prediction algorithm and the heavy focus on data collection through the NPK sensor, whose readings fluctuate over a wide range.

The paper [3] presents a Crop Selection Method which aims to solve the crop selection problem and enhance the net yield of the harvest [3]. The authors propose a strategy that suggests a range of crops to be chosen over a season by taking into account essential factors such as the climate, soil composition, water density and crop category. The estimated values of the most influential factors determine the precision of the Crop Selection Method. The technique considered in the paper is crop sequencing. The crops are categorized into four divisions, namely seasonal, whole-year, short-time plantation, and long-time. Crops from each category are selected in a sequence for cultivation. Hence there is a necessity for a prediction technique with improved precision and performance. In addition, the compulsion to select at least one crop from each category is a major setback.

The paper [4] presents a comprehensive analysis of soil characteristics and behaviour, and on this basis the authors introduce a method of predicting crop utilization using a data mining approach [4]. The paper focuses on improving the yield of the crops rather than on the crop selection technique. The soil datasets are taken as inputs and analyzed. Based on the analysis, the soil is classified into low, medium, and high classes using data mining procedures. As a result of this categorization, the crop yield is predicted using the Naive Bayes and k-Nearest Neighbor algorithms. Here the predicted crop yield is formalized as a classification rule. Although there are many positive highlights with respect to improving the crop yield, there are a few drawbacks. The major drawback is that the problem is not addressed at its source; the focus is only on increasing the crop yield. Further setbacks are that the method has been tested with very few data mining algorithms, so more algorithms need to be included. The dataset used is very small, as stated by the authors themselves, since large datasets introduce considerable complexity.

The paper [5] gives clear details of a new framework designed by its authors, the eXtensible Crop Yield Prediction Framework (XCYPF) [5]. The framework is mainly designed for predicting crop yield. It claims to provide crop selection, selection of dependent and independent variables, and the datasets required for crop yield forecasting. It also claims to be integrated with a management information system providing expertise in the field of precision agriculture. The framework has been tested for rice and sugarcane crops, and it works exclusively for these two crops.

Special concern has always been shown for how to increase the productivity of crops. Various methods and improvised techniques have been designed to boost crop yield. Solving the problem at the source immediately eradicates the issue. Hence choosing the right crop to cultivate will lead to better crop productivity and, in turn, boost the economy of the country.

IV. METHODOLOGY

A brief step-by-step procedure for designing the crop recommendation system is as follows:

Step 1: Input

The input dataset is a comma-separated values (CSV) file containing the soil dataset, which has to be subjected to preprocessing.

Step 2: Preprocessing of input data

The input dataset is subjected to various preprocessing techniques such as filling of missing values, encoding of categorical data, and scaling of values to an appropriate range.

Step 3: Splitting into training and testing dataset

The preprocessed dataset is then split into training and testing datasets based on the specified split ratio. The split ratio used in the proposed work is 75:25, which means 75% of the dataset is used for training the ensemble model and the remaining 25% is used as the test dataset.
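Steps 1 to 3 can be illustrated with a minimal sketch. The column names, the toy values, and the specific choices of mean imputation, integer label encoding and min-max scaling are assumptions made for illustration; the text does not state which preprocessing techniques or dataset columns were actually used. Only the 75:25 split ratio comes from the paper.

```python
import csv
import io
import random

# Hypothetical CSV content; the real dataset's columns are not listed in the text.
CSV_TEXT = """ph,soil_type,rainfall_mm,season
6.5,clay,820,Kharif
,loam,640,Rabi
7.1,sandy,910,Kharif
5.9,clay,450,Rabi
6.8,loam,700,Kharif
7.4,sandy,560,Rabi
6.1,clay,880,Kharif
6.6,loam,610,Rabi
"""

def to_float(text):
    """Parse a numeric cell; an empty cell becomes None (a missing value)."""
    return float(text) if text.strip() else None

def fill_missing(column):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]

def encode_labels(column):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {cat: i for i, cat in enumerate(sorted(set(column)))}
    return [codes[v] for v in column]

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle rows reproducibly, then split them (75:25 by default)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Step 1: read the CSV input.
records = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Step 2: fill missing values, encode categorical data, scale numeric values.
ph = min_max_scale(fill_missing([to_float(r["ph"]) for r in records]))
soil = encode_labels([r["soil_type"] for r in records])
rain = min_max_scale([float(r["rainfall_mm"]) for r in records])
labels = [r["season"] for r in records]

# Step 3: split into training and testing datasets.
rows = list(zip(ph, soil, rain, labels))
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))  # 6 2
```

With eight records the split yields six training rows and two test rows. The fixed seed only makes the shuffle repeatable and is not something the paper specifies.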
The training dataset is fed to each of the independent base learners, and the individual classifiers are built using the training dataset.

Step 5: Testing the data on each of the classifiers

The testing dataset is applied to each of the classifiers, and the individual class labels are obtained.

A. Ensemble Framework

M => Classifier, I => Inducer, S => Training set

c) Diversity generator – Generation of diverse classifiers.

d) Combiner – Responsible for combining the class labels obtained from the individual classifiers [21].
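The training and testing steps and the ensemble components can be sketched as follows. Each inducer I takes a training set S and returns a classifier M, diversity comes from using different inducers, and the combiner takes a majority vote. The three inducers here are deliberately simple stand-ins chosen so the sketch runs with the standard library alone; they are not the Random Forest, Naive Bayes and Linear SVM learners used in the paper, and the toy feature values are hypothetical.

```python
from collections import Counter

def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def induce_nearest_centroid(train):
    """Inducer: returns a classifier that predicts the class with the closest mean."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))

def induce_one_nn(train):
    """Inducer: returns a classifier that copies the label of the nearest training sample."""
    return lambda x: min(train, key=lambda s: sq_dist(x, s[0]))[1]

def induce_single_feature(train, feature=0):
    """Inducer: nearest centroid on one feature only (a crude decision stump)."""
    projected = [((x[feature],), y) for x, y in train]
    model = induce_nearest_centroid(projected)
    return lambda x: model((x[feature],))

def majority_vote(labels):
    """Combiner: the label predicted by the most classifiers.
    With three classifiers and two classes a tie cannot occur."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical scaled features (rainfall, temperature) with season labels.
train = [((0.9, 0.8), "Kharif"), ((0.8, 0.9), "Kharif"), ((0.7, 0.7), "Kharif"),
         ((0.2, 0.3), "Rabi"), ((0.1, 0.2), "Rabi"), ((0.3, 0.1), "Rabi")]
test = [((0.85, 0.75), "Kharif"), ((0.15, 0.25), "Rabi")]

# Step 4: build each classifier M = I(S) from the same training set.
inducers = (induce_nearest_centroid, induce_one_nn, induce_single_feature)
classifiers = [induce(train) for induce in inducers]

# Step 5: obtain individual class labels, then combine them.
predictions = [majority_vote([clf(x) for clf in classifiers]) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(predictions, accuracy)  # ['Kharif', 'Rabi'] 1.0
```

Using an odd number of base learners on a two-class problem guarantees that the vote always has a strict majority. With more classes, a tie-breaking rule would be needed.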
created by utilizing the training set. The main advantages of the Random Forest algorithm are that it provides better accuracy, is robust to outliers, is faster than bagging and boosting, and is simple and easy to parallelize.

where y_k(x) is the classification of the kth classifier and g(y, c) is an indicator function defined as:
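A common way to write the majority-voting rule in terms of y_k(x) and g(y, c) is sketched below. This is the standard textbook form and is an assumption about the intended definition, not a quotation from the paper. The symbol c_i, ranging over the possible class labels, is likewise assumed notation.

```latex
\mathrm{class}(x) = \arg\max_{c_i} \sum_{k} g\bigl(y_k(x),\, c_i\bigr),
\qquad
g(y, c) =
\begin{cases}
1, & y = c \\
0, & y \neq c
\end{cases}
```

Under this form, each classifier contributes one vote to the class it predicts, and the class with the most votes is returned.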
C. Naive Bayes
Voting Technique has been used as the combination method to provide the best accuracy.

The accuracy obtained using the ensembling technique is 99.91%. Hence, the proposed work helps the farmer select the right crop for cultivation. This improves crop productivity, which in turn boosts the economy of the country.

REFERENCES

[5] Aakunuri Manjula, Dr. G. Narsimha (2015), “XCYPF: A Flexible and Extensible Framework for Agricultural Crop Yield Prediction”, Conference on Intelligent Systems and Control (ISCO).

[6] Anshal Savla, Parul Dhawan, Himtanaya Bhadada, Nivedita Israni, Alisha Mandholia, Sanya Bhardwaj (2015), “Survey of classification algorithms for formulating yield prediction accuracy in precision agriculture”, Innovations in Information, Embedded and Communication Systems (ICIIECS).

[7] D. Ramesh, B. Vishnu Vardhan, “Data mining technique and applications to agriculture yield data”, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 2, Issue 9, September 2013.

[8] Satish Babu (2013), “A Software Model for Precision Agriculture for Small and Marginal Farmers”, at the International Centre for Free and Open Source Software (ICFOSS), Trivandrum, India.

[14] M. C. S. Geetha, “Implementation of Association Rule Mining for different soil types in Agriculture”, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 4, Issue 4, April 2015.

[15] T. R. Lekha, “Efficient Crop Yield and Pesticide Prediction for Improving Agricultural Economy using Data Mining Techniques”, International Journal of Modern Trends in Engineering and Science, Vol. 03, Issue 10, 2016.

[16] Shweta Taneja, Rashmi Arora, Savneet Kaur, “Mining of Soil Data Using Unsupervised Learning Technique”, International Journal of