Crop and Yield Prediction Model
Abstract: The agricultural sector needs a well-defined and systematic approach for predicting crops together with their yield, and for supporting farmers in taking correct decisions that enhance the quality of farming. Predicting the best crop is complex because no crop knowledge base is available. Crop prediction is an efficient way to improve farming quality and increase revenue. Data clustering is an efficient data mining approach for extracting useful information and making predictions. The approaches implemented so far have addressed either crop prediction or yield prediction. A crop prediction model helps farmers take correct decisions, which in turn improves the quality of farming and generates better revenue for farmers. Traditional clustering algorithms such as k-means, improved rough k-means and k-means++ complicate the task because the initial cluster centers are selected randomly and the number of clusters must be decided in advance. The modified k-means algorithm is therefore used to improve the accuracy of the system, as it achieves high-quality clusters through its selection of initial cluster centers.

Keywords: crops, quality farming, prediction, k-means, disease, yield, temperature effect, water requirement, evapo-transpiration, plant.

I INTRODUCTION

Crop prediction is a widespread problem. A farmer is interested in knowing how much yield to expect. Traditionally, farmers estimate this from long-term experience with specific yields, crops and weather conditions. The farmer thinks directly about yield prediction rather than about crop prediction, yet if the correct crop is predicted, the yield will be better. This paper addresses the problem of crop and yield prediction using a modified k-means clustering algorithm, thereby generating better revenue for farmers.

Clustering is the process of grouping data into classes or clusters, so that objects within a cluster have high similarity to one another but are very dissimilar to objects in other clusters. A cluster of data objects can be treated collectively as one group and so may be regarded as a form of data compression. Unlike classification, clustering is an effective means of partitioning a set of data into groups based on data similarity and then assigning labels to the relatively small number of groups. Clustering is unsupervised learning, as it does not rely on predefined classes and class-labeled training examples. For this reason, clustering is a form of learning by observation rather than learning by examples.

As shown in Figure 1, three clusters are formed, each containing data points grouped according to the position of its center. The cluster centers are shown by + signs. The quality of a cluster depends on how dense it is, so a cluster with a larger number of points is a cluster of good quality.

Figure 1 Cluster Analysis

II LITERATURE SURVEY

CRY: An improved Crop Yield Prediction model using Bee Hive Clustering Approach for Agricultural data sets (2013):
This paper proposes the Bee Hive algorithm for predicting crop yield from a historical data set. The algorithm handles large data sets, but it has the drawback of requiring a number of tunable parameters and a k value.

An improved Rough K-means algorithm with weighted distance measure (2012):
This paper proposes a way to search for the initial central points and combines it with a weighted distance measure. It requires additional parameters such as a density threshold and the number of clusters.

k-means++: The advantages of careful seeding (2013):
This paper proposes the k-means++ clustering algorithm, which uses a randomized seeding technique. Its drawbacks are the need to supply the number of clusters and the decision of the initial center.

An EM clustering algorithm which produces a dual representation (2012):
This paper suggests an EM algorithm that handles real-world data sets, but it selects the k value randomly, is sensitive to noise and is highly complex in nature.
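The randomized seeding idea behind k-means++, mentioned in the literature survey, can be illustrated with a short Python sketch. It is a minimal illustration rather than the implementation used in any of the surveyed papers: the function names, the sample records and the fixed random seed are all hypothetical. The first center is drawn uniformly at random, and each further center is drawn with probability proportional to its squared distance from the nearest center already chosen.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng=None):
    """Pick k initial centers with k-means++ style seeding (illustrative)."""
    rng = rng or random.Random()
    # First center: chosen uniformly at random from the data.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of each point: squared distance to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # All points coincide with chosen centers; fall back to a uniform pick.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Hypothetical 2-D records forming three visible groups.
data = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0), (9.1, 1.2)]
seeds = kmeans_pp_seeds(data, k=3, rng=random.Random(0))
print(seeds)
```

Note that the number of clusters k still has to be supplied by the caller, which is the drawback the survey points out.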
WWW.IJASRET.COM 23
|| Volume 1 ||Issue 1 ||April 2016||ISSN (Online) 2456-0774
INTERNATIONAL JOURNAL OF ADVANCE SCIENTIFIC RESEARCH
AND ENGINEERING TRENDS
III MOTIVATION

Crop prediction is a widespread problem. During the growing season, a farmer is curious to know how much yield to expect. In the past, this yield prediction relied on the farmer's long-term experience of specific yields, crops and climatic conditions. With the existing system, the farmer goes directly for yield prediction rather than considering crop prediction. Unless the correct crop is predicted, the yield cannot be better; in addition, existing systems do not consider the pesticide, environmental and meteorological parameters related to the crop. Promoting and stabilizing agricultural production at a faster pace is one of the essential conditions for agricultural development. The production of any crop is driven either by an increase in area or by an improvement in yield, or both. In India, the possibility of extending the area under any crop does not exist except by resorting to increased cropping intensity or crop substitution. So, variations in crop productivity continue to trouble the sector and create severe distress. There is therefore a need for a good crop prediction technique to overcome the existing problems.

A. Traditional approaches:
Experience-based farming and agriculture.

C. Efficiency issue:
CRY: It predicts the yield based on historical data. It randomly selects the centroid. It requires additional parameters such as density and threshold.

Because the number of clusters (the k value) is required at the start for traditional k-means and k-means++, the same calculated number of clusters is provided to them, and their initial cluster centers are chosen uniformly [1], [2]. All three approaches perform clustering and provide output in the form of a cluster number and a centroid matrix.

Sample Testing and Prediction: Input parameters such as zone, district, season, soil type, maximum temperature, minimum temperature and average rainfall must be provided for sample testing. Based on the output values of each clustering, the distance between the test data and the clustering output is calculated, and the cluster at minimum distance is selected as the predicted value. The predicted cluster is then looked up in the output cluster numbers (idx) and, by priority, the very first output value of the predicted cluster is selected. Next, the number of records with the same output value is found among the expected values, and the accuracy is calculated in terms of this count. The accuracy count is shown as a pie chart.

VI SYSTEM ARCHITECTURE
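The sample testing and prediction step amounts to a nearest-centroid lookup followed by a count of matching outputs. The Python sketch below is one possible reading of that description, not the authors' code. It assumes that only the numeric inputs (maximum temperature, minimum temperature, average rainfall) are used, that `idx` holds the cluster number of each training record, and that accuracy is the fraction of test samples whose predicted output equals the expected one. The function names, crop labels and values are hypothetical.

```python
import math


def nearest_cluster(sample, centroids):
    """Index of the centroid at minimum Euclidean distance from the sample."""
    distances = [math.dist(sample, c) for c in centroids]
    return distances.index(min(distances))


def predict_crop(sample, centroids, idx, crops):
    """Find the nearest cluster, then return the output of the first
    training record assigned to it (the 'very first output value')."""
    cluster = nearest_cluster(sample, centroids)
    for record_cluster, crop in zip(idx, crops):
        if record_cluster == cluster:
            return crop
    return None  # no training record fell into this cluster


def accuracy(samples, expected, centroids, idx, crops):
    """Fraction of test samples whose prediction matches the expected output."""
    hits = sum(
        predict_crop(s, centroids, idx, crops) == e
        for s, e in zip(samples, expected)
    )
    return hits / len(samples)


# Hypothetical clustering output: centroid matrix and per-record cluster numbers.
centroids = [(35.0, 22.0, 60.0), (28.0, 15.0, 120.0)]
idx = [0, 1, 0, 1]
crops = ["cotton", "rice", "cotton", "rice"]

# Hypothetical test samples: (max temperature, min temperature, average rainfall).
tests = [(34.0, 21.0, 65.0), (27.0, 16.0, 110.0)]
print(predict_crop(tests[0], centroids, idx, crops))
print(accuracy(tests, ["cotton", "wheat"], centroids, idx, crops))
```

Categorical inputs such as zone, district, season and soil type would need a numeric encoding before they can enter the distance measure; that choice is left open here.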
VII EARLY BLIGHT DISEASE PREDICTION OF TOMATO CROP

1. LEAVES:
Initially, small dark spots form on older foliage near the ground. Leaf spots are round, brown and can grow up to half an inch in diameter. Larger spots have target-like concentric rings, and the tissue around the spots often turns yellow. Severely infected leaves turn brown and fall off, or dead, dried leaves may cling to the stem.

Fruit spots are leathery and black, with raised concentric ridges, and generally occur near the stem.

1. CALCULATION OF PDI:
PDI is the percent disease index, used to measure disease intensity.
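The disease-prediction rule in this section compares maximum temperature with PDI through a correlation coefficient: a positive value is taken to mean that the crop is affected, a negative value that it is not. A minimal Python sketch of that rule is given here, assuming the standard Pearson form of the coefficient and paired observations of equal length. The function names and the sample readings are hypothetical, not data from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation of two equal-length series, computed from raw sums."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (n * sum_xy - sum_x * sum_y) / denominator


def disease_expected(max_temps, pdi_values):
    """Decision rule from the text: positive correlation means affected."""
    return pearson_correlation(max_temps, pdi_values) > 0


# Hypothetical weekly readings: maximum temperature (deg C) and PDI (%).
max_temps = [28.0, 30.0, 32.0, 34.0]
pdi_values = [10.0, 18.0, 25.0, 33.0]
print(disease_expected(max_temps, pdi_values))
```

A correlation of exactly zero is treated as "not affected" in this sketch; the text does not say how that case should be handled.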
The correlation (CORR) measures the relevance between maximum temperature and PDI, which affects disease development:

\[ \mathrm{CORR} = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N \sum X^2 - \left(\sum X\right)^2\right]\left[N \sum Y^2 - \left(\sum Y\right)^2\right]}} \]

where X and Y are the paired observations of maximum temperature and PDI, and N is the number of pairs.

If the correlation of maximum temperature with PDI is positive, the tomato crop is affected by the disease; if the correlation is negative, the crop is not affected by the disease.

IMPLEMENTATION REPORT:

Figure 5 Disease Prediction

VIII WATER REQUIREMENT OF TOMATO CROP

Water requirement is the most important factor for the healthy growth of crops. It is the amount of water potentially required to meet the evapotranspiration needs of the plant, so that the plant does not suffer in its growth through a short supply of water.

Figure 6 Water Requirements of Crops

The evapotranspiration rate is the amount of water that is lost to the atmosphere through the leaves of the plant, as well as from the soil surface. Therefore, to estimate the water requirement of a crop, we first need to measure the evapotranspiration rate. The reference evapotranspiration rate, ET0, is the estimate of the amount of water that is used by a well-watered grass surface that is roughly 8 to 15 centimeters in height. Once ET0 is known, the water requirement of the crop can be calculated.

IX MODIFIED K-MEANS CLUSTERING

The modified k-means algorithm is a data clustering approach that improves on the sensitivity of k-means to the initial centers (seed points) of the clusters. The algorithm partitions the whole space into different segments and calculates the frequency of data points in each segment. The segment that shows the maximum frequency of data points has the maximum probability of containing the centroid of a cluster. The steps are:

1. Input: data set and value of k.
2. If the value of k is 1 then Exit.
3. Else
4. /* Divide the data point space into k*k, i.e. k vertically and k horizontally */
5. For each dimension
   {
6.    Calculate the minimum and maximum value of the data points
7.    Calculate the range of group (RG) using equation ((max - min)/k)
8.    Divide the data point space into k groups with width RG
9. }
10. Calculate the frequency of data points in each partitioned space.
11. Choose the k highest-frequency groups.
12. Calculate the mean of each selected group. /* This will be the initial centroid of the cluster. */
13. Calculate the distance between each pair of clusters using equation (3)
14. Take the minimum distance for each cluster and halve it using equation (4)
15. For each data point Zp, p = 1 to N0
    {
16.    For each cluster j = 1 to k
       {
17.       Calculate d(Zp, Mj) using equation (1)
18.       If (d(Zp, Mj) ≤ dcj)
          {
19.          Then assign Zp to cluster Cj
20.          Break
21.       }
22.       Else
23.          Continue;
       }
24.    If Zp does not belong to any cluster then
25.       Zp ∈ min(d(Zp, Mi)) where i ∈ [1, Nc]
26. }
27. If the termination condition is satisfied then Exit.
28. Else
29. Calculate the centroid of each cluster using equation (2) of the k-means algorithm.
30. Go to step 13.

PENMAN-MONTEITH EQUATION:

The reference rate, ET0, is calculated using the Penman equation, which takes into account the climatic parameters of temperature, solar radiation, wind speed and humidity. A variation of this equation, published by the FAO, is:

\[ ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \]

Where,
ET0 reference evapotranspiration [mm day-1],
Rn net radiation at the crop surface [MJ m-2 day-1],
G soil heat flux density [MJ m-2 day-1],
T air temperature at 2 m height [°C],
u2 wind speed at 2 m height [m s-1],
es saturation vapour pressure [kPa],
ea actual vapour pressure [kPa],
es - ea saturation vapour pressure deficit [kPa],
Δ slope of the vapour pressure curve [kPa °C-1],
γ psychrometric constant [kPa °C-1].

Crops have specific characteristics that make them vary from each other. With these factors taken into account, ET0 is converted into ETc through the crop-specific coefficient, Kc. ETc represents the evapotranspiration rate of the crop under standard conditions (no stress conditions). When calculating ETc, one must identify the growth stages of the crop and their duration, and select the proper Kc coefficient to be used.

ETc = Kc * ET0.

Climatic effects are incorporated into ET0, while the effects of the crop characteristics are incorporated into Kc.

Figure 7 Crop stages of tomato and crop coefficients used for water management
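The two-step water requirement calculation (reference rate ET0, then crop rate ETc = Kc * ET0) can be sketched in Python as follows. This is a minimal illustration that assumes the FAO-56 form of the Penman-Monteith equation, with its constants 0.408, 900, 273 and 0.34, and takes all climatic terms as already-computed inputs in the units listed in the variable definitions. The function names, the sample climate values and the Kc value are hypothetical.

```python
def reference_et0(delta, rn, g, gamma, t, u2, es, ea):
    """Reference evapotranspiration ET0 [mm/day], FAO Penman-Monteith form.

    delta: slope of the vapour pressure curve [kPa/degC]
    rn:    net radiation at the crop surface [MJ/m2/day]
    g:     soil heat flux density [MJ/m2/day]
    gamma: psychrometric constant [kPa/degC]
    t:     air temperature at 2 m height [degC]
    u2:    wind speed at 2 m height [m/s]
    es/ea: saturation / actual vapour pressure [kPa]
    """
    radiation_term = 0.408 * delta * (rn - g)
    wind_term = gamma * (900.0 / (t + 273.0)) * u2 * (es - ea)
    return (radiation_term + wind_term) / (delta + gamma * (1.0 + 0.34 * u2))


def crop_etc(kc, et0):
    """Crop evapotranspiration under standard conditions: ETc = Kc * ET0."""
    return kc * et0


# Hypothetical daily climate inputs (illustrative values only).
et0 = reference_et0(
    delta=0.122, rn=13.28, g=0.14, gamma=0.0666,
    t=16.9, u2=2.078, es=1.997, ea=1.408,
)
# Hypothetical crop coefficient for one growth stage.
etc = crop_etc(kc=1.15, et0=et0)
print(round(et0, 2), round(etc, 2))
```

Because Kc changes with the growth stage, `crop_etc` would in practice be called once per stage with the coefficient chosen for that stage and its duration.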
REFERENCES