Crop Price Prediction Using Machine Learning

A project report submitted to


Jawaharlal Nehru Technological University Kakinada, in the partial
Fulfillment for the Award of Degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

Submitted by
21491A05R4: Kari Harika
21491A05S0: D. Vanaja
21491A05S1: Thupakula Aishwarya
21491A05S6: J. Anitha
21491A05T9: Ronda Vijayakumar Reddy
21491A05w4: Devisetty Pujitha Sai
Under the esteemed guidance of
Dr. Muthamizh Selvam
Associate Professor

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

QIS COLLEGE OF ENGINEERING AND TECHNOLOGY


(AUTONOMOUS)
An ISO 9001:2015 Certified institution, approved by AICTE & Reaccredited by NBA, NAAC ‘A+’ Grade
(Affiliated to Jawaharlal Nehru Technological University, Kakinada)
VENGAMUKKAPALEM, ONGOLE – 523 272, A.P
April, 2024

QIS COLLEGE OF ENGINEERING AND TECHNOLOGY
(AUTONOMOUS)
An ISO 9001:2015 Certified institution, approved by AICTE & Reaccredited by NBA, NAAC ‘A+’ Grade
(Affiliated to Jawaharlal Nehru Technological University, Kakinada)
VENGAMUKKAPALEM, ONGOLE: -523272, A.P
December 2024

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


CERTIFICATE

This is to certify that the technical report entitled “Crop Price Prediction Using
Machine Learning” is a bonafide work of the following final B. Tech students in partial
fulfillment of the requirement for the award of the degree of Bachelor of Technology in
COMPUTER SCIENCE AND ENGINEERING for the academic year 2024-2025.

21491A05R4: Kari Harika


21491A05S0: D. Vanaja
21491A05S1: Thupakula Aishwarya
21491A05S6: J. Anitha
21491A05T9: Ronda Vijayakumar Reddy
21491A05w4: Devisetty Pujitha Sai
Signature of the guide Signature of Head of Department
Dr.M.Selvam M.Tech, Ph.D Dr.D.Bujji Babu M.Tech, Ph.D
Associate Professor HOD, Professor in CSE

Signature of External Examiner

ACKNOWLEDGEMENT

We, the team behind the crop price prediction project, would like to express our sincere
gratitude to everyone who provided support and guidance throughout the development of this
project.

We express our gratitude to the Hon’ble chairman Sri. N. SURYA KALYAN
CHAKRAVARTHY GARU, QIS Group of Institutions, Ongole for his valuable suggestions
and advice in the B. Tech course. We express our gratitude to Dr. Y.V. HANUMANTHU
RAO, Ph.D., Principal of QIS College of Engineering & Technology, Ongole for his valuable
suggestions and advice in the B. Tech course.

We express our gratitude to the Head of the Department of CSDS, Dr. G. Lakshmi Vara
Prasad, M.Tech, Ph.D., QIS College of Engineering & Technology, Ongole for his constant
supervision, guidance and co-operation throughout the project. We would like to express our
thankfulness to our project guide Dr. M. Selvam, M.Tech, Ph.D., Assistant Professor - AIML,
QIS College of Engineering & Technology, Ongole for his constant motivation and valuable help
throughout the project work.

Finally, we would like to thank our parents, family and friends for their co-operation in
completing this project.

Submitted by
21491A05R4: Kari Harika
21491A05S0: D. Vanaja
21491A05S1: Thupakula Aishwarya
21491A05S6: J. Anitha
21491A05T9: Ronda Vijayakumar Reddy
21491A05w4: Devisetty Pujitha Sai

ABSTRACT

Our project introduces a machine learning-based system designed to predict future crop
prices, leveraging inputs such as crop type, date, and historical prices. The model aims to address
challenges faced by the agricultural sector due to unpredictable price fluctuations, which impact
farmers' financial stability and market decision-making. By implementing Random Forest
Regression, this system can analyze non-linear patterns within large datasets, providing precise
price predictions that assist farmers, sellers, and buyers in planning their activities effectively.

The user-friendly, web-based interface allows clients to access real-time prices and view
future estimates for various crops. With daily updates, the model ensures accuracy and
reliability, reflecting current trends and supporting informed decisions. This predictive system
highlights the potential of machine learning to transform agricultural practices by delivering
insights that help mitigate risk, optimize resource allocation, and promote sustainable farming.

This project also demonstrates the feasibility of using machine learning in crop price
prediction, proving especially valuable in a sector highly influenced by external factors such as
climate and market demand. Future developments could expand the system’s capabilities by
incorporating additional data sources, like real-time weather or market demand data, further
enhancing prediction accuracy and practical relevance for the agricultural community.

TABLE OF CONTENTS
Chapter No Title Page No

ABSTRACT 4
LIST OF TABLES 7
LIST OF FIGURES 8
LIST OF SYMBOLS AND ABBREVIATIONS 9

1 INTRODUCTION 10-11
1.1 History 10
1.2 Motivation 10
1.3 Goals 11
2 LITERATURE SURVEY 12-14
3 SYSTEM ANALYSIS 15-19
3.1 Existing System 15
3.2 Proposed System 16
3.2.1 The Data Collection Module
3.2.2 Data Preprocessing Module:
3.2.3 Machine Learning Model
3.2.4 Database Management System
3.2.5 Prediction Module
3.2.6 Web-based User Interface (UI)
3.3 Software Specification 18
3.3.1 Frontend
3.3.2 Backend
3.3.3 Database
3.3.4 Machine Learning
3.3.5 Deployment
4 ML TECHNIQUES 20-21
4.1 Linear Regression

4.2 Decision Trees and Random Forests
4.3 Support Vector Machines (SVMs)
4.4 Time-Series Models (ARIMA, LSTM)
5 SYSTEM DESIGN 22-23

5.1 System Architecture

5.2 Use Case Diagram


6 SOURCE CODE 24-27
6.1 Import Libraries
6.2 Load the Data
6.3 Handle Missing Values
6.4 Encode Categorical Features
6.5 Prepare the Data
6.6 Convert 'Date' to a Numerical Format
6.7 Split the Data
6.8 Create and Train the Random Forest Model
6.9 Make Predictions
6.10 Evaluate the Model

7 RESULTS AND DISCUSSION 28-32


7.1 Result
7.2 Discussion
8 CONCLUSION 33
9 FUTURE SCOPE 34
10 REFERENCES 35-36

LIST OF TABLES

TABLE NO. | TITLE | PAGE NO.
1 | Recent Review papers related to crop price prediction using machine learning | 12
2 | Comparison between various models used and accuracy | 13
3 | Comparison on R-square, MAE, MSE of different models | 30

LIST OF FIGURES
FIGURE NO. | TITLE | PAGE NO.
1 | Comparison results of accuracy | 14
2 | General process involved in crop price prediction using machine learning | 15
3 | Ideology for the crop price prediction | 16
4 | Architectural diagram for the crop price prediction | 20
5 | General process in the crop price prediction model in use case format | 23
6 | Accuracy comparison between Random Forest Regression and Decision Tree Regression | 28
7 | Comparison between actual and predicted values using Decision Tree regression | 29
8 | Welcome to Crop Price Predictor – Predict and Track Vegetable Prices | 30
9 | Make a Crop Price Prediction | 31
10 | Crop Price Prediction Results | 31
11 | Daily Crop Price Updates - Crop Price Predictor | 32

LIST OF SYMBOLS AND ABBREVIATIONS

ARIMA Autoregressive Integrated Moving Average
LSTM Long Short-Term Memory
3D-CNN 3-Dimensional Convolutional Neural Network
RMSE Root Mean Square Error
CNN Convolutional Neural Network
SVM Support Vector Machines
LR Linear Regression
DT Decision Tree
RF Random Forest
MAE Mean Absolute Error
MSE Mean Square Error
ML Machine Learning
GDP Gross Domestic Product
WPI Wholesale Price Index
CME Chicago Mercantile Exchange
CSV Comma-Separated Values
AWS Amazon Web Services

1. Introduction

Agriculture is fundamental to economies around the world, yet farmers often face
significant financial instability due to unpredictable crop price fluctuations. These price
variations are influenced by a range of factors including seasonality, weather conditions, and
supply chain disruptions. Such volatility directly impacts not only farmers’ income but also
market stability and food security. Therefore, reliable crop price forecasting is essential to
support better decision-making among farmers, traders, and policymakers.
Traditional methods of price forecasting, which rely on historical data and expert
assessments, struggle to capture the full complexity and dynamic nature of the factors affecting
crop prices. However, recent advancements in machine learning (ML) provide new opportunities
for accurate, data-driven price prediction. ML algorithms can analyze vast amounts of historical
data, detect patterns, and make precise forecasts even with complex, multi-factor dependencies.
By harnessing the predictive power of ML, we can offer agricultural stakeholders tools to make
proactive, well-informed decisions.
1.1 History
Agriculture plays a vital role in many economies around the world, with farmers being
crucial stakeholders in this field. Unfortunately, they often face significant financial losses
due to price fluctuations after harvests. These price fluctuations can substantially impact a
country's Gross Domestic Product (GDP), which is closely linked to the prices of
agricultural products. Therefore, accurate crop price estimation and evaluation are essential
for making informed decisions before farming particular crops. Predicting crop prices helps
farmers make informed decisions, minimizing losses and managing the risks associated with
price fluctuations. This project addresses the problem of crop price prediction by applying
machine learning methods, thereby benefiting farmers and modernizing agricultural practices
to meet the demands of a growing population. A major issue faced by farmers is the lack of
knowledge about the ideal crops to cultivate based on their soil quality and structure. The
proposed system utilizes machine learning and prediction algorithms to forecast crop prices,
considering parameters such as weather forecasts, soil conditions, and the Wholesale Price
Index (WPI).
1.2 Motivation

The main motivation behind the project Crop Price Prediction Using Machine Learning is as
follows. The agricultural industry is a vital sector that provides food and livelihoods for
millions of people around the world. However, it faces numerous challenges, including climate
change, market volatility, and price fluctuations, which can significantly impact crop yields,
farmer incomes, and food security. One of the most critical factors affecting agricultural
sustainability and profitability is crop price uncertainty, whose effects include:
Financial losses: Farmers may struggle to recover their investments, leading to financial distress
and even bankruptcy.
Food insecurity: Unstable prices can disrupt food supply chains, affecting the availability and
affordability of nutritious food for consumers.
Environmental degradation: Inefficient farming practices, driven by price uncertainty, can
result in soil degradation, water pollution, and biodiversity loss.
Market inefficiencies: Price volatility can lead to market inefficiencies, reducing the overall
competitiveness of the agricultural sector.

1.3 Goals
The primary objective of the crop price prediction using machine learning project is to
develop a reliable and accurate model that can forecast crop prices based on historical data and
other relevant factors. This model aims to provide farmers, policymakers, and other stakeholders
with a decision-making tool that can help them make informed decisions. The specific goals are to:
• Develop a model that can be deployed in a real-world setting and provide accurate predictions,
making it a valuable resource for the agricultural community.
• Increase the efficiency of agricultural markets by providing insights into the factors that
influence crop prices.
• Develop a model that can be scaled up or down depending on the specific needs of the users,
making it a versatile tool for a wide range of applications.
• Gather data from reliable sources such as government databases and agricultural reports.
• Help with decision-making by providing timely and reliable predictions.
This project focuses on developing a crop price prediction system using the Random
Forest Regression model, which has demonstrated high accuracy in capturing non-linear
relationships within data. This model, incorporated into a web-based platform, allows users to
input variables such as crop name and date and receive real-time price forecasts. Daily updates
ensure the model remains accurate and relevant, supporting farmers, traders, and buyers in
planning their activities and mitigating risks associated with price volatility. This project thus
aims to create a reliable, user-friendly solution that promotes market stability and supports
sustainable agricultural practices.

2. Literature Review / Background
The literature indicates that machine learning (ML) methods are powerful tools for
addressing agricultural price volatility, especially given the influence of variables like
seasonality, weather, and market demand. Techniques such as Random Forest Regression and
Decision Tree Regression have been widely used, demonstrating their effectiveness due to their
ability to model complex, non-linear relationships. Random Forest, in particular, has shown high
accuracy in studies focused on agricultural price predictions due to its ensemble nature, which
reduces overfitting and enhances generalization across varied data.
S.No | Title | Algorithms used | Year
1 | Deep transfer learning for crop yield prediction with remote sensing data | Neural network | 2018
2 | An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning | Random forest | 2019
3 | Crop yield prediction using machine learning: A systematic literature review | Linear Regression, Random Forest | 2020
4 | CROP PRICE PREDICTION USING MACHINE LEARNING | Decision tree Regression | 2021
5 | Using a novel clustered 3D-CNN model for improving crop future price prediction | Clustered 3D-CNN model | 2022
6 | Predicting Agricultural Commodities Prices with Machine Learning: A Review of Current Research | LSTM and neural network | 2023
7 | Crop Price Prediction System Using ML | decision tree Regression | 2024

Table 1: Recent Review papers related to crop price prediction using machine learning
Table 1 lists recent papers related to crop price prediction using machine learning.

Research shows that using historical price data, meteorological inputs, and socio-
economic factors improves ML model performance. Furthermore, time-series forecasting and
hybrid models that combine multiple ML methods (e.g., Random Forest and 3D Convolutional
Neural Networks) have achieved prediction accuracies upwards of 90%. These hybrid and
advanced models can capture multi-dimensional data, adding robustness to predictions and
assisting in decision-making for farmers and agricultural stakeholders.
Challenges cited in the literature include data quality issues like missing values and
outliers, as well as limitations in real-time adaptability. The review also highlights future
opportunities, such as the integration of IoT devices for on-site data collection and the use of
unstructured data sources (e.g., social media sentiment), which could further enhance the
model’s responsiveness and accuracy in the evolving agricultural market context.
In areas such as agriculture, where crop prices depend on weather conditions,
seasonality and market conditions, machine learning (ML) is a powerful technique for predictive
modeling. Following its success in agricultural price prediction, Random Forest Regression has
become one of the most successful methods.

This research work presents a crop price prediction system that uses machine learning to
support agricultural decisions based on future prices. Historical price data, meteorological
factors and socio-economic factors serve as the parameters for training the machine learning
models. The system is intended to deliver accurate price estimates that help farmers and other
stakeholders make informed decisions about resource use and farming. By combining decision tree
regression and random forest regression, the system shows potential for dealing with the economic
unpredictability reflected in changing crop prices, increasing the sustainability of agriculture
and improving the management of the supply chain.
The approach involves collecting data from different sources, including government data
sets, followed by processing, training and testing of the model through supervised learning. To
assess the accuracy of its predictions, the model is trained on 70% of the data while the
remaining 30% is used for testing. Specifically, the decision tree and random forest models are
used to forecast the price for the next 12 months as affected by meteorological and economic
variables. The reported overall accuracy of the model's crop price predictions was 95%,
indicating a high level of reliability.
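The 70/30 hold-out split described above can be sketched as follows. This is a minimal illustration in plain Python, not the reviewed paper's code; the record layout (month, rainfall, price), the synthetic values and the function name `split_70_30` are assumptions made for the example.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle the rows reproducibly and return (train, test) in a 70/30 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical records: (month index, monthly rainfall in mm, price)
records = [(m, 50.0 + 10 * m, 1000.0 + 25 * m) for m in range(1, 21)]
train, test = split_70_30(records)
print(len(train), len(test))
```

The model is fitted only on `train`; `test` is held back so that the reported accuracy reflects data the model has not seen.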

S.No | Title | Methods / Algorithms | Parameters | Accuracy
1 | Crop Price Prediction System Using ML | Decision Tree Regression, Random Forest Regression | Historical crop price data, meteorological data (monthly rainfall), economic factors | 97%
2 | Crop Price Prediction Using Machine Learning | Decision Tree Regressor, Random Forest | Season, state, area, soil type, temperature, pH value, rainfall, pesticide usage, fertilizer usage | 85%
3 | Crop Price Prediction | Random Forest, Linear Regression | Yield and past crop prices | 98%
4 | An Intelligent Crop Price Prediction Using Suitable Machine Learning Algorithm | Decision Tree Regression, Random Forest Regression | Rainfall, historical crop prices | 92%
5 | Using a Novel Clustered 3D-CNN Model for Improving Crop Future Price Prediction | Clustered 3D-CNN, ARIMA | Temperature, precipitation, economic factors and trading data | 91.7%

Table 2: Comparison between various models used and accuracy

This paper discusses a crop price prediction system that uses machine learning to make
agricultural price predictions more efficient. It uses machine learning algorithms such as
Decision Tree Regressor and Random Forest to predict future crop prices. It gathers data from
sources like Kaggle, comprising factors such as season, state, area, soil type, temperature, pH,
rainfall, pesticide utilization and fertilizer use. These parameters are processed and fed into
the machine learning algorithms, which estimate prices and help farmers decide which crops to
cultivate. The system was tested on a few crops, and the outputs were compared with the current
market prices of those crops. In general, Random Forest yielded better results than Decision
Tree, with higher accuracy and more robust predictions.
The paper by Kener & Kener titled “CROP PRICE PREDICTION” puts forward a system for
predicting crop prices in order to reduce the losses that result from price volatility. Using
yield and previous prices, the system draws on past data to forecast future prices. The two main
algorithms employed are Random Forest and Linear Regression; the former predicts on the basis of
historical crop data for the newest data set. The model aims to help farmers anticipate crop
prices so that they can plan their finances better. Linear Regression was found to be the more
accurate of the two, with an accuracy of 98% against 59% for Random Forest.
The paper titled “An Intelligent Crop Price Prediction Using Suitable Machine Learning
Algorithm” covers predicting crop prices based on rainfall and price data. The system uses
Decision Tree and Random Forest regression to give monthly price forecasts that assist farmers in
determining the most profitable crops to grow, when to harvest, and when to sell or store the
produce. It is implemented as a website that is accessible to farmers and end-users in general.
The model uses data acquired from various government sources, and its reported prediction
accuracy is 92%.
This research paper describes a new methodology for crop price forecasting based on a
Clustered 3D Convolutional Neural Network (3D-CNN) model. Because crop prices are driven by
environmental, economic and trading factors and are non-homogeneous and non-stationary in nature,
traditional models such as ARIMA are ineffective. The proposed 3D-CNN model outperformed ARIMA:
it was able to capture these complexities and predict the future prices of crops such as wheat,
oats, rice, corn and soybeans with greater accuracy. The model's ability to work with
multi-dimensional data improves the estimates of price trends and helps inform measures against
food insecurity.

Fig 1: comparison results of accuracy


Fig 1 and Table 2 show the accuracy achieved by the different models used to predict crop prices.

3. System Analysis
3.1 Existing System
The existing system is a website that farmers can access to choose a crop based on their
financial condition, needs, feasibility and other metrics. It covers multiple crops widely grown
in the country. The dashboard shows the best-performing and worst-performing crops along with
the percentage by which they are rising or falling. Predictions extend to the next 12 months,
and the data is shown in the form of pie charts and graphs. The system has a user-friendly
interface, and decision tree regression is used for prediction. An in-depth statistical analysis
of previous data is carried out to build a refined platform for interaction, which farmers can
use to plan on the basis of the predictions for the next 12 months.
First, an updated dataset comprising month-wise rainfall and wholesale prices for every
crop is taken from data.gov.in. After the required pre-processing, the model is trained and
then evaluated. If it is found suitable, the front end and back end are designed and the ML
model is deployed at the back end. The dataset is updated periodically and the model is
redeployed. This is supervised learning because there are multiple inputs and an output, and a
correlation between them is derived.

Fig 2: General process involved in crop price prediction using machine learning
Fig 2 shows the general process involved in crop price prediction: collecting data, processing it,
and then predicting the price.
The two options suitable for this are linear regression and decision tree regression,
because both can predict continuous values based on multiple inputs. Decision tree regression is
the one chosen because, by observation, there is no linear relation between the inputs and the
output in the given dataset. The Crop Price Prediction Website for crop forecasting takes
government data for 20+ crops and presents it in a structured manner, showing the monthly
increase or decrease in crop prices along with crop details such as type, location and export
factors, so that farmers can plan and manage their finances and sowing/harvesting accordingly.
The algorithm takes the input parameters months, year and monthly rainfall. PyCharm is
used for training the ML model and Visual Studio for developing the frontend and backend. Data
collected from various sources is ingested and prepared according to the requirements of the
system. The machine learning model is designed and trained using the prepared data, and then
evaluated using standard metrics. If the results do not meet the requirements, the model is
retrained. When the desired results are achieved, the system is deployed.
Decision Tree Regression: The dataset is divided into multiple leaves, which are the result of
a sequence of yes/no decisions. A new data point is routed to the leaf it lands in, and the
prediction is the average of the training values in that leaf.
Random Forest Regression: It works on the basis of ensemble learning, which says that
combining multiple algorithms, or the same algorithm multiple times, can create a superior
algorithm. Random Forest makes use of multiple decision trees to give the output. Because the
dataset is large, Random Forest first extracts a small chunk of the data, feeds it to one
decision tree regression model and trains that model; this process is repeated multiple times.
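The two ideas above, leaf averaging and combining many trees trained on random samples, can be sketched in plain Python. This is a deliberately simplified illustration: each "tree" here is a single split (a stump) on one feature, and the rainfall and price values are made up. The function names are hypothetical and this is not the project's implementation.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree: choose the threshold with the lowest
    squared error and store the mean target value of each leaf."""
    best = None
    for t in sorted(set(xs))[1:]:                      # candidate thresholds
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:                                   # all x equal: one leaf only
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]                                    # (threshold, left mean, right mean)

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """The forest prediction is the average of the individual tree predictions."""
    return mean(predict_stump(s, x) for s in forest)

rainfall = [10, 20, 30, 40, 50, 60]          # hypothetical inputs
price = [100, 110, 105, 200, 210, 205]       # hypothetical targets
stump = fit_stump(rainfall, price)
forest = fit_forest(rainfall, price)
print(stump, predict_forest(forest, 10), predict_forest(forest, 60))
```

A real random forest grows deeper trees and also samples features at each split; the sketch keeps only the averaging and resampling steps described in the text.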
3.2 Proposed System:
At EBI, our project aims to provide the right solution to the identified problem of
estimating crop prices, thereby benefiting poor farmers. We apply up-to-date machine learning
algorithms to various data sets to improve on existing solutions. Specifically, for predicting
crop prices we use Decision Tree regression models trained on validated data. The application
therefore has the potential to increase yield efficiency by analysing and predicting the impact
of the environment on crops. A sound system of crop price forecasting can also provide
opportunities for buyers. Finally, the results are delivered as a web application that should be
easily available to struggling farmers. The presented machine learning based pricing approach
has the methodological advantage of combining both technical and fundamental analyses.
This section describes the plan, design, and functional flow of the proposed crop price
prediction system. The system is built to forecast crop prices based on several input factors,
including crop type, date, historical pricing data, and external influences such as weather. The
primary predictive model used is Random Forest Regression, valued for its effectiveness in
handling complex, non-linear datasets and for limiting overfitting by combining the results of
many decision trees.
3.2.1 The Data Collection Module: Historical Crop Price Data: In this module, historical
crop pricing data is compiled from multiple relevant sources such as databases, APIs and daily
market reports. Crop Name and Date Data: The main user inputs are the crop name and the date.
These are linked to past price data in order to make projections. External Data (Optional):
While the primary system uses crop name, date and historical price data, other data such as
weather conditions, market demand and supply factors can help improve the accuracy of the
predictions.

Fig 3: ideology for the crop price prediction
Fig 3 shows the process of crop price prediction as a diagram.
3.2.2 Data Preprocessing Module:
Data Cleaning: The first step in any data mining exercise is pre-processing the collected data,
which involves handling missing values, outliers and duplicate records. This phase assures data
quality. Feature Engineering: The system transforms the input data to create useful features,
including: Normalization and scaling: So that the model receives consistent inputs and trains
effectively, the input data, for example historical prices, has to be normalized.
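The cleaning and normalization steps above can be sketched in plain Python. The record layout (crop, date, price), the choice of filling a missing price with that crop's mean price, and the function name `clean_and_scale` are assumptions for illustration. Outlier handling is omitted to keep the sketch short.

```python
from statistics import mean

def clean_and_scale(rows):
    """rows: list of (crop, date, price or None) tuples.
    Returns rows without duplicates, with missing prices filled by the crop's
    mean price, and with a min-max scaled price appended as a fourth field.
    Assumes every crop has at least one known price."""
    # 1. Drop exact duplicate records while keeping the original order.
    unique = list(dict.fromkeys(rows))
    # 2. Fill missing prices with the mean of the known prices of the same crop.
    known = {}
    for crop, _, price in unique:
        if price is not None:
            known.setdefault(crop, []).append(price)
    filled = [(c, d, p if p is not None else mean(known[c])) for c, d, p in unique]
    # 3. Min-max normalization of the price to the range [0, 1].
    lo = min(p for _, _, p in filled)
    hi = max(p for _, _, p in filled)
    span = (hi - lo) or 1.0            # avoid division by zero for constant prices
    return [(c, d, p, (p - lo) / span) for c, d, p in filled]

raw = [
    ("tomato", "2024-01-01", 20.0),
    ("tomato", "2024-01-02", None),    # missing value
    ("tomato", "2024-01-03", 30.0),
    ("tomato", "2024-01-03", 30.0),    # duplicate record
    ("onion", "2024-01-01", 40.0),
]
cleaned = clean_and_scale(raw)
for row in cleaned:
    print(row)
```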
3.2.3 Machine Learning Model – Random Forest Regression:
Random Forest Regressor was chosen for its versatility in accommodating a range of
datasets and its ability to identify non-linear relationships between crop prices and input
features including date, crop name and price history. A dataset of crop names with past prices
and the related dates is used to train the model. The model builds several decision trees; each
tree predicts the crop price using a different subset of the data. Random Sampling and
Averaging: Each tree makes its own prediction, and the final prediction is the average of all of
them, which reduces the risk of overfitting and improves overall accuracy.
Hyperparameter tuning: To enhance model accuracy, parameters such as the minimum samples per
split, the maximum depth of the trees and the number of trees are tuned.
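One way to tune the three hyperparameters named above is a grid search with cross-validation in scikit-learn. The sketch below is an assumption about how this could be done, not the project's actual code. The data is synthetic, the two features (an encoded crop id and a day number) are hypothetical, and the grid values are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic, hypothetical data: columns are [encoded crop id, day of year].
rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(0, 3, 200), rng.integers(0, 365, 200)])
y = 20 + 10 * X[:, 0] + 5 * np.sin(X[:, 1] / 58.0) + rng.normal(0, 0.5, 200)

# The three hyperparameters mentioned in the text.
param_grid = {
    "n_estimators": [50, 100],        # number of trees
    "max_depth": [4, None],           # maximum depth of each tree
    "min_samples_split": [2, 5],      # minimum samples needed to split a node
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

model = search.best_estimator_
prediction = model.predict(np.array([[1, 120]]))
print(search.best_params_, prediction)
```

Grid search retrains the forest for every parameter combination and fold, so the grid is kept small here; a randomized search is a common alternative when the grid grows.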
3.2.4 Database Management System: Storage of Historical Data: Crop names, dates, and
historical crop prices are all kept in a single central database at CME. Due to changes in
consumer preferences, this data is updated from time to time to match market requirements.
Daily Data Update: To ensure that the most recent data is used in forecasting, the system
refreshes the database with new crop prices on a daily basis.
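A daily refresh of a central price table can be sketched with Python's built-in `sqlite3` module. The table name `crop_prices`, its columns and the helper functions are hypothetical, and SQLite is used only so the example is self-contained. The report itself names MySQL or MongoDB as candidate databases.

```python
import sqlite3

def open_price_db(path=":memory:"):
    """Open the database and create the price table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS crop_prices (
               crop TEXT NOT NULL,
               price_date TEXT NOT NULL,   -- ISO format, YYYY-MM-DD
               price REAL NOT NULL,
               PRIMARY KEY (crop, price_date)
           )"""
    )
    return conn

def daily_update(conn, todays_prices):
    """Insert new prices; replace the row if a (crop, date) pair already exists."""
    conn.executemany(
        "INSERT OR REPLACE INTO crop_prices (crop, price_date, price) VALUES (?, ?, ?)",
        todays_prices,
    )
    conn.commit()

def price_history(conn, crop):
    """Return the stored (date, price) pairs of one crop, oldest first."""
    cur = conn.execute(
        "SELECT price_date, price FROM crop_prices WHERE crop = ? ORDER BY price_date",
        (crop,),
    )
    return cur.fetchall()

conn = open_price_db()
daily_update(conn, [("tomato", "2024-01-01", 20.0), ("onion", "2024-01-01", 40.0)])
# A later update adds a new day and corrects an earlier price.
daily_update(conn, [("tomato", "2024-01-02", 22.5), ("tomato", "2024-01-01", 21.0)])
history = price_history(conn, "tomato")
print(history)
```

Using the crop and date together as the primary key means a repeated daily load overwrites a price instead of duplicating it.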

3.2.5 Prediction Module: Users supply three primary input parameters: the name of the
crop, the date and, if available, the historical price of the crop. The system processes these
inputs to predict the future price of the selected crop. Real-time Predictions: The system
predicts crop prices from both historical and current data and makes the results available to
users in near real time.
3.2.6 Web-based User Interface (UI): Frontend Interface: Users enter the required inputs
(the name of the crop, the date and its price history) through an online portal. It is easy to
use, and the interfaces are self-explanatory in most cases. Price Display and Historical
Trends: To help users understand the data, historical trends, future crop prices and other
important data are presented in graphical form. User Query: Users can interact with the system
by submitting queries about the prices of different crops at different times.

3.3 Software Specifications


1. Frontend:
- Framework: React or Bootstrap
- Language: JavaScript or TypeScript
- Features:
• User Authentication and Authorization
• Crop Selection and Filtering
• Price Prediction Visualization (Charts, Graphs, etc.)
• User Input for Custom Predictions
2. Backend:
- Framework: Flask or Django
- Language: Python
- Features:
• Data Ingestion and Processing
• Model Training and Deployment
• API for Frontend Communication
• Database Integration for Data Storage
3. Database:
- Type: Relational Database (e.g., MySQL) or NoSQL Database (e.g., MongoDB)
- Features:
• Data Storage for Crop Prices, Weather Data, and Economic Indicators
• Data Retrieval for Model Training and Prediction
4. Machine Learning:
- Algorithm: Linear Regression, Decision Trees, Random Forest, SVR, LSTM, or
Ensemble Methods
- Features:
• Data Preprocessing and Feature Engineering
• Model Training and Hyperparameter Tuning
• Model Evaluation and Selection
5. Deployment:
- Platform: Cloud-based (e.g., AWS, Google Cloud, Microsoft Azure) or on-premise
4. ML TECHNIQUES
4.1 Linear Regression
Linear Regression is one of the simplest yet effective techniques for crop price
prediction. It assumes a linear relationship between the input variables (features) and the target
variable (price). By minimizing the error between predicted and actual prices, the model
generates a line that best fits the data points. This technique works well when the relationship
between variables is approximately linear, such as when predicting crop prices based on a single
factor like rainfall or fertilizer use.
However, Linear Regression struggles with complex, non-linear relationships, which are
often inherent in agricultural markets. For instance, crop prices can be influenced by
combinations of factors like weather patterns, international demand, and pest outbreaks. In such
cases, the model may oversimplify the problem, leading to inaccurate predictions. Despite these
limitations, its interpretability makes it a good starting point for simpler datasets.
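The single-factor case described above can be sketched as follows. This is a minimal illustration on synthetic data, not the report's dataset: the rainfall values, the price formula, and the variable names are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic illustration only: price falls linearly as rainfall rises.
rng = np.random.default_rng(0)
rainfall_mm = rng.uniform(50, 300, size=200).reshape(-1, 1)  # one feature
price = 40.0 - 0.05 * rainfall_mm.ravel() + rng.normal(0, 0.5, size=200)

model = LinearRegression()
model.fit(rainfall_mm, price)   # ordinary least squares fit

slope = model.coef_[0]          # estimated price change per mm of rainfall
intercept = model.intercept_
print(f"price = {intercept:.2f} + ({slope:.4f}) * rainfall_mm")
```

The fitted slope and intercept can be read directly as "price change per unit of the factor", which is the interpretability advantage mentioned above.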
4.2 Decision Trees and Random Forests
Decision Trees are non-linear models that split the data into subsets based on feature
thresholds. At each split, the model chooses the feature and value that minimizes prediction
error, creating a tree-like structure to predict crop prices. Random Forests improve upon this by
building multiple trees (ensemble) and averaging their predictions, reducing the risk of
overfitting.
These models are particularly effective in handling datasets with categorical and
continuous variables, such as region, soil type, and historical prices. Random Forests also handle
missing data better than many other models. However, they can be computationally intensive for
large datasets and may not capture temporal dependencies well unless combined with additional
techniques, like lagged variables or time-series-specific preprocessing.
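One way to give a Random Forest access to temporal structure is to add lagged prices as features. The sketch below assumes a synthetic daily series with a weekly cycle; the column names `lag_1` and `lag_7` are hypothetical and chosen only for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic daily price series with a weekly cycle (illustration only).
days = np.arange(120)
prices = pd.DataFrame({"cost": 30 + 5 * np.sin(2 * np.pi * days / 7)})

# Lagged features: the price 1 day and 7 days earlier.
prices["lag_1"] = prices["cost"].shift(1)
prices["lag_7"] = prices["cost"].shift(7)
prices = prices.dropna()   # the first 7 rows lack a complete history

# Chronological split: train on the past, test on the most recent 20 days.
train, test = prices.iloc[:-20], prices.iloc[-20:]
features = ["lag_1", "lag_7"]

forest = RandomForestRegressor(n_estimators=100, random_state=42)
forest.fit(train[features], train["cost"])

errors = forest.predict(test[features]) - test["cost"].to_numpy()
mae = float(np.mean(np.abs(errors)))
print(f"MAE on the last 20 days: {mae:.3f}")
```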
4.3 Support Vector Machines (SVMs)
SVMs aim to find a hyperplane in a high-dimensional space that separates data points of
different classes (in classification tasks) or fits a regression line (in regression tasks). For crop
price prediction, SVMs with kernel tricks can model non-linear relationships effectively,
especially in smaller datasets with a high-dimensional feature space.
Despite their strength in non-linear modeling, SVMs require careful tuning of
hyperparameters, such as the regularization parameter and kernel choice. Additionally, they are
less scalable to large datasets due to their computational complexity. SVMs are often used in
niche scenarios where data is clean, and non-linearity is a key factor but computational resources
are limited.
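For regression, scikit-learn exposes this technique as `SVR`. The following sketch fits an RBF-kernel SVR to a synthetic non-linear relationship; the data and the particular values of `C` and `epsilon` are assumptions for illustration and would normally be chosen by a hyperparameter search.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear relationship between one feature and price.
rng = np.random.default_rng(1)
X = rng.uniform(0, 6, size=(150, 1))
y = 20 + 5 * np.sin(X.ravel()) + rng.normal(0, 0.2, size=150)

# Scaling matters for kernel methods, so it is part of the pipeline.
svr = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=100.0, epsilon=0.1),  # C: regularization, kernel: RBF
)
svr.fit(X, y)

r2 = svr.score(X, y)   # R-square on the training data
print(f"training R-square: {r2:.3f}")
```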
4.4 Time-Series Models (ARIMA, LSTM)

Time-series models like ARIMA (Auto-Regressive Integrated Moving Average) are
explicitly designed to handle temporal data. ARIMA models forecast future crop prices by
analyzing patterns in past data, such as seasonality and trends. While effective for short-term
predictions, they require stationarity in the data, which can be limiting for dynamic, non-
stationary price patterns.
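The stationarity requirement is usually addressed by differencing, which is the "Integrated" part of ARIMA. The sketch below uses only pandas and a synthetic trending series (an assumption for illustration) to show how a first difference removes a drifting mean.

```python
import numpy as np
import pandas as pd

# Synthetic series: upward trend plus noise, so the mean drifts over time.
rng = np.random.default_rng(2)
t = np.arange(200)
price = pd.Series(20 + 0.1 * t + rng.normal(0, 1, size=200))

diffed = price.diff().dropna()   # first difference, i.e. d = 1 in ARIMA(p, d, q)

# Compare the mean of the first and second halves of each series.
raw_gap = abs(price.iloc[:100].mean() - price.iloc[100:].mean())
diff_gap = abs(diffed.iloc[:100].mean() - diffed.iloc[100:].mean())
print(f"mean gap, raw: {raw_gap:.2f}; differenced: {diff_gap:.2f}")
```

A large gap for the raw series and a small one for the differenced series suggests that the trend, rather than the noise, was the source of non-stationarity.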
For more complex temporal relationships, deep learning models like LSTMs (Long
Short-Term Memory networks) are highly effective. LSTMs can capture long-term dependencies
in time-series data, making them suitable for forecasting crop prices influenced by extended
weather cycles or global trade trends. However, they demand substantial computational
resources and large datasets, making them best suited for large-scale applications.
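Before a price series can be fed to an LSTM, it is typically cut into fixed-length windows, each paired with the value that follows it. The helper below is a hypothetical sketch of that preparation step using NumPy only; `make_windows` is not part of any library, and the network itself is omitted.

```python
import numpy as np

def make_windows(series, window):
    """Return (X, y): X[i] holds `window` consecutive values, y[i] the next one."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    # Recurrent layers typically expect (samples, timesteps, features).
    return X[..., np.newaxis], y

X, y = make_windows([10, 11, 12, 13, 14, 15], window=3)
print(X.shape)   # (3, 3, 1)
print(y)         # [13. 14. 15.]
```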

5. System Design
5.1 System Architecture

Fig 4: Architectural diagram for the crop price prediction


Fig 4 shows the crop price prediction process in diagrammatic form.

This section describes the plan, design, and functional flow of the proposed crop price
prediction system. The system is built to forecast crop prices from several
input factors, including crop type, date, historical pricing data, and external influences such as
weather. The primary predictive model is Random Forest Regression, chosen for its
effectiveness in handling complex, non-linear datasets and for limiting overfitting by
combining the outputs of many decision trees.
5.2 Use Case Diagram
The use case diagram outlines the flow of a crop price prediction system, showcasing the
interactions between the user, developer, and the system. The developer begins by uploading the
dataset, which serves as the foundation for the prediction process. The user then provides
additional inputs (e.g., crop type, location, or date) necessary for generating predictions. The
system performs pre-processing on the input data, handling tasks like cleaning, normalization, or
feature extraction to ensure the data is ready for machine learning tasks.
Following pre-processing, the system trains a machine learning model using the prepared
dataset. After training, the model undergoes testing to validate its performance and accuracy.
Once testing is complete, the user can request a price prediction based on the inputs provided.
The system processes the request, applies the trained model, and generates a prediction, which is
presented to the user as the final result. This use case reflects a seamless workflow emphasizing
automation and user-centric functionality in crop price prediction.

Fig 5: General process of the crop price prediction model in use case format
The flowchart illustrates the process of crop price prediction, starting with data
collection from agricultural sources such as farms, markets, or weather data. The collected data
undergoes a preprocessing stage, which includes cleaning, normalizing, and preparing it for
analysis. This processed data is fed into a machine learning model, which is trained using
historical and contextual data stored in a centralized training data repository. The trained
model learns the relationships between factors influencing crop prices, such as market demand,
weather conditions, and historical trends.
The system integrates with a Firebase backend to manage user interactions and store
additional data dynamically. Users provide specific inputs, such as crop type, location, or season,
through a mobile Android application, which communicates with Firebase to fetch predictions.
The system processes the user inputs using the trained machine learning model, delivering a
prediction that includes the expected crop price. This result is displayed to the user, aiding
decision-making in agricultural planning and trading. The use of Firebase ensures real-time
communication and data flow, making the system interactive and responsive.

6. Source Code
6.1 Import Libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import LabelEncoder
 pandas: For data manipulation and analysis.
 train_test_split: Splits data into training and testing sets to evaluate model performance.
 RandomForestRegressor: A machine learning algorithm for regression tasks.
 LabelEncoder: Encodes categorical variables (like vegetable names) into numerical
values.
6.2 Load the Data
data = pd.read_csv('/content/new_dataset.csv')
This loads the dataset from the file new_dataset.csv into a pandas DataFrame named data. The
dataset is expected to contain columns like Date, Vegetable, and Cost.
6.3. Handle Missing Values
data.ffill(inplace=True)
 This replaces missing values in the dataset using the forward fill method, where missing
entries are filled with the previous value in the column. DataFrame.ffill is used because
fillna(method='ffill') is deprecated in recent pandas releases.
 inplace=True modifies the original DataFrame without creating a copy.
6.4. Encode Categorical Features
le = LabelEncoder()
data['Vegetable'] = le.fit_transform(data['Vegetable'])
 LabelEncoder converts the categorical column Vegetable (e.g., "Tomato", "Peas") into
numerical values (e.g., 0, 1, 2).
 fit_transform:
o fit identifies all unique categories.
o transform maps each category to a unique number.
6.5. Prepare the Data
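X = data[['Date', 'Vegetable']].copy()
y = data['Cost']
 X: Input features for the model (columns Date and Vegetable). The .copy() call makes X an
independent DataFrame, so the Date conversion in the next step does not trigger a
SettingWithCopyWarning.
 y: Target variable (column Cost), which represents the price to be predicted.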
6.6. Convert 'Date' to a Numerical Format
X['Date'] = pd.to_datetime(X['Date'], dayfirst=True)
X['Date'] = X['Date'].apply(lambda x: x.timestamp())
 pd.to_datetime: Converts the Date column into a datetime object.
o The parameter dayfirst=True assumes the date format is DD/MM/YYYY.
 timestamp(): Converts the datetime object to a Unix timestamp (a numeric format
representing seconds since 01-01-1970).
This is necessary because scikit-learn estimators such as RandomForestRegressor require numeric feature values.
6.7. Split the Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 train_test_split divides the dataset into:
o 80% training data (X_train, y_train): Used to train the model.
o 20% testing data (X_test, y_test): Used to evaluate the model's performance.
 test_size=0.2: Indicates that 20% of the data will be used for testing.
 random_state=42: Ensures reproducibility of the split.
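Because train_test_split shuffles rows by default, a random split of dated price records places some later dates in the training set and some earlier dates in the test set. An alternative that is sometimes preferred for forecasting is a chronological split. The sketch below is not what the report's code does; it uses a hypothetical miniature stand-in for the dataset with the same column names.

```python
import pandas as pd

# Hypothetical miniature stand-in for the dataset (same column names).
data = pd.DataFrame({
    "Date": pd.date_range("2024-01-01", periods=10, freq="D"),
    "Vegetable": [0, 1] * 5,
    "Cost": [20, 35, 21, 36, 22, 34, 23, 37, 24, 38],
})

data = data.sort_values("Date")
cutoff = int(len(data) * 0.8)                 # first 80% of rows by date

train, test = data.iloc[:cutoff], data.iloc[cutoff:]
print(len(train), len(test))                  # 8 2
print(train["Date"].max() < test["Date"].min())   # True
```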
6.8. Create and Train the Random Forest Model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
 RandomForestRegressor: A machine learning algorithm that builds multiple decision
trees and averages their predictions for accuracy and robustness.
o n_estimators=100: Creates 100 decision trees.
o random_state=42: Ensures reproducibility.
 fit: Trains the model using the training data (X_train, y_train).
6.9. Make Predictions
predictions = model.predict(X_test)
 predict: Uses the trained model to predict the cost for the testing data (X_test).
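To predict the cost for a single new query, such as one crop on one date, the query has to be encoded exactly as the training data was. The sketch below assumes a tiny made-up dataset and a hypothetical helper named `predict_cost`; it reuses the fitted LabelEncoder rather than fitting a new one.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import LabelEncoder

# Tiny made-up dataset with the same columns as the report's data.
data = pd.DataFrame({
    "Date": ["01/09/2024", "02/09/2024", "03/09/2024"] * 2,
    "Vegetable": ["Tomato"] * 3 + ["Peas"] * 3,
    "Cost": [22.0, 23.0, 24.0, 60.0, 61.0, 62.0],
})
le = LabelEncoder()
data["Vegetable"] = le.fit_transform(data["Vegetable"])
data["Date"] = pd.to_datetime(data["Date"], dayfirst=True).apply(lambda d: d.timestamp())

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(data[["Date", "Vegetable"]], data["Cost"])

def predict_cost(date_str, vegetable):
    """Encode one (date, vegetable) query the same way as the training data."""
    query = pd.DataFrame({
        "Date": [pd.to_datetime(date_str, dayfirst=True).timestamp()],
        "Vegetable": le.transform([vegetable]),   # reuse the fitted encoder
    })
    return float(model.predict(query)[0])

print(predict_cost("04/09/2024", "Tomato"))
```

A crop name that was not present during training makes `le.transform` raise a ValueError, so an application built on this pattern would need to handle that case.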

6.10. Evaluate the Model
from sklearn.metrics import mean_squared_error
rmse = mean_squared_error(y_test, predictions) ** 0.5
print(f'Root Mean Squared Error: {rmse}')
 mean_squared_error: Calculates the mean of the squared differences between the actual values
(y_test) and the predicted values (predictions).
o ** 0.5: Takes the square root to obtain the root mean squared error (RMSE), a widely used
metric for regression problems. The squared=False argument that older scikit-learn versions
offered for this was deprecated in version 1.4 and removed in later releases.
 RMSE indicates how far the predictions are, on average, from the actual values (lower
RMSE is better).

7. Results and Discussion
7.1 Result
Based on the crop price history analyzed with machine learning models, particularly
Random Forest Regression, the crop price prediction project yields the following
findings. The synthesis of findings indicates that features such as date, crop
name, and previous cost are sufficient to produce accurate price predictions. The model also
captures interactions among these features and shows that factors such as seasonality and the
historical trend of crop prices have a considerable influence on the price of crops.
Nevertheless, the study also indicates that integrating more external variables, such as weather data or
market demand, could improve the accuracy of the predictions.
Regarding the identification of gaps, the Random Forest model works well on historical
data and the main constraints; however, it has no built-in ability to respond to market shifts
induced by external factors such as natural disasters, changes in
legislation, or unexpected increases in demand. Furthermore, the project focuses mainly on
structured data; unstructured data such as news articles or social media sentiment could be
integrated to extend the system so that it reflects real-time
market conditions as they happen.
Recent trends in the field of crop price prediction show increasing
use of more advanced machine learning techniques, such as deep learning,
other ensemble learning methods, and hybrid approaches that combine
statistical models with other artificial intelligence techniques.

Fig 6: Accuracy comparison between Random Forest Regression and Decision Tree Regression (panels: Random Forest Regression; Decision Tree)
Fig 6 shows the accuracy percentage for different vegetables under the Random Forest and
Decision Tree algorithms. The plots indicate that Random Forest Regression gives higher accuracy
than Decision Tree Regression.
Another notable development is the integration of IoT devices in agriculture, where
field data such as moisture and temperature can feed the models
directly. These developments point toward models that account for more factors and achieve a higher level of
predictive capacity.
The implications for practice are considerable, especially for the farmers,
traders, and policy makers who depend on crop price forecasts. The system can help users
anticipate trends in prices, planting, and market supply.
It is also important to account for external conditions in order to make better short-term
predictions. In conclusion, this work supports the claim that
machine learning can be used for crop price prediction. Although some limitations were identified
in this study, further studies and advancements can address them and uncover more trends that
improve the efficiency and suitability of the designed system.

Fig 7: Comparison between actual and predicted values using Random Forest Regression and Decision Tree Regression (panels: Random Forest Regression; Decision Tree Regression)
Fig 7 compares the predicted and actual values for the
Random Forest Regression and Decision Tree Regression algorithms. The Random
Forest Regression predictions track the actual values more closely.

The implications for research include exploring other machine learning methods
and additional features such as weather, satellite, and socio-economic data streams.

Model                       R-square   Mean absolute error (MAE)   Mean square error (MSE)
Random forest regression    0.98       150                         3400
Decision Tree Regression    0.89       190                         5000
Linear regression           0.80       250                         6000
Table 2: Comparison of R-square, MAE, and MSE of different models

In Table 2, the different models used in the crop price prediction project are compared, and their
performance is assessed using statistical measures including R-square, MAE, and MSE.
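The three measures can be computed with scikit-learn as sketched below. The actual and predicted values here are made up for illustration and are not taken from the project's results.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up values for illustration only.
actual = np.array([20.0, 22.0, 25.0, 30.0])
predicted = np.array([21.0, 21.0, 27.0, 29.0])

mae = mean_absolute_error(actual, predicted)   # mean of |error|
mse = mean_squared_error(actual, predicted)    # mean of error squared
rmse = mse ** 0.5                              # same units as the price
r2 = r2_score(actual, predicted)               # share of variance explained

print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.3f}  R-square={r2:.3f}")
```

MSE is expressed in squared price units, so it is usually converted to RMSE before being compared with MAE. For any set of predictions, MAE cannot exceed RMSE, which is a quick consistency check on reported metrics.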

Fig 8: Welcome to Crop Price Predictor – Predict and Track Vegetable Prices
The primary purpose of the “Crop Price Predictor” web page is to provide a simple interface for
people who want crop and vegetable price predictions and current vegetable price data. The
main content area prominently features two action buttons: “Prediction,” which
leads to crop price prediction, and “Display Cost,” which
shows the current day's prices of different vegetables.

Fig 9: Make a Crop Price Prediction
This page lets users request a crop price prediction for a given day and a given
crop. The main part of the page contains a calendar where the user can choose a specific
day in dd-mm-yyyy format, together with the crop name. After selecting these options, the user
clicks the “Predict” button to run the prediction.

Fig 10: Crop Price Prediction Results

This “Prediction Results” web page of the “Crop Price Predictor” application shows the predicted price of
a crop on a chosen date. It presents the key data: the name of the crop (e.g.,
Tomato), the selected date (e.g., 13-09-2024), and the predicted price of the crop (e.g., ₹
22.98). The page is clean and simply informs the user of the prediction made.

Fig 11: Daily Crop Price Updates - Crop Price Predictor


The "Crop Price Predictor" website periodically provides the latest information on the prices of
various vegetables. The layout has a table listing the different vegetables with their prices for the day
and the date of the update. It covers commonly used vegetables such as onion, tomato, green chilli,
and potato.
7.2 Discussion
Based on the crop price history analyzed with machine learning models, particularly Random
Forest Regression, the synthesis of findings indicates that features such as date, crop name, and
previous cost are sufficient to produce accurate price predictions. The model also captures
interactions among these features and shows that factors such as seasonality and the historical
trend of crop prices have a considerable influence on the price of crops.
Regarding the identification of gaps, the Random Forest model works well on historical data and the
main constraints; however, it has no built-in ability to respond to market shifts induced by
external factors such as natural disasters, changes in legislation, or unexpected
increases in demand. Furthermore, the project focuses mainly on structured data; unstructured
data such as news articles or social media sentiment could be integrated to extend the
system so that it reflects real-time market conditions as
they happen.

8. Conclusion

This project successfully demonstrates how machine learning, specifically Random
Forest Regression, can be leveraged to predict crop prices with high accuracy, helping farmers,
traders, and other stakeholders make informed decisions in an often unpredictable agricultural
market. By using data-driven insights from historical prices, dates, and crop types, the model
delivers reliable, real-time predictions through a web-based platform. This system not only
empowers users with timely information but also addresses a critical need in agriculture by
providing a tool to reduce financial risks associated with crop price volatility.
The project’s main contributions include a user-friendly platform for price prediction, a
robust data pipeline that integrates real-time data updates, and a high-accuracy prediction model
that can capture complex price patterns influenced by seasonality and market trends. These
contributions underscore the potential of machine learning to bring stability and transparency to
agricultural markets, ultimately benefiting the economy and enhancing food security.
However, there are areas for improvement and further research. Integrating additional
real-time data sources, such as weather patterns, soil conditions, and market demand indicators,
could enhance the model's responsiveness to sudden changes. The inclusion of unstructured data,
such as news articles and social media sentiment, may also improve the model’s adaptability to
market sentiment. Future research could explore deep learning models like LSTMs or ensemble
techniques to further improve prediction accuracy and better handle complex, long-term
dependencies in data.
This project lays the foundation for a more adaptable and intelligent crop price prediction
system, with potential applications that extend beyond agriculture to support sustainable
economic growth and resilience in the face of market fluctuations.

9. Future Scope
The future scope of this crop price prediction project, which uses Random Forest Regression on parameters like
date, crop name, and historical price data, is expansive. By integrating additional features such as
weather data (temperature, rainfall), soil quality, and market dynamics, the model can provide
more accurate predictions. Including macroeconomic indicators like subsidies, inflation, and
trade policies would further enhance the robustness of the predictions. Scaling the model for
multi-crop and multi-regional scenarios can address local variations in pricing and market
conditions, making it adaptable to diverse agricultural ecosystems.
Advancements in technology present opportunities to evolve the project into a real-time
prediction system. By linking with APIs or IoT devices for live data inputs, predictions can be
dynamically updated, enabling immediate decision-making. Deployment on cloud platforms
ensures scalability, while user-friendly mobile apps or web dashboards can deliver actionable
insights directly to farmers, traders, and policymakers. Additionally, integrating Random Forest
with hybrid or time-series models would further improve forecasting accuracy, particularly for
seasonal trends and long-term analysis.
Our project holds significant potential to impact sustainable agriculture and economic
planning. Farmers can make informed decisions about crop sales to maximize profits, reducing
post-harvest losses and stabilizing incomes. Governments and businesses can leverage
predictions to optimize supply chain logistics, allocate resources efficiently, and design better
policies, such as minimum support prices or crop insurance. As the model evolves, it can become
a critical tool for addressing challenges like food security, market volatility, and environmental
sustainability in agriculture.

