
Research Notes Draft 2

The document compares and contrasts regression analysis and cluster analysis in agriculture. Regression analysis is a supervised learning method used to predict outcomes based on input variables, while cluster analysis is an unsupervised learning method that groups similar data points together without labels. Both methods have advantages and disadvantages for applications in agriculture such as predicting crop yields or grouping farming practices.

Uploaded by

olorato

Comparing Cluster and Regression Analysis for Small-scale Farming

| Aspect | Regression Analysis | Cluster Analysis |
|---|---|---|
| Learning type | Supervised learning | Unsupervised learning |
| Definition | Regression analysis in agriculture is used to predict outcomes from input variables. It helps in understanding how the typical value of the dependent variable changes when any one of the independent variables is varied while the others are held fixed. | Cluster analysis in agriculture is like sorting vegetables into groups based on their similarities. Instead of sorting by eye, you use algorithms to analyze data from sensors that measure things like the health or chemical makeup of plants. |
| Goal | To predict numeric outcomes before they occur. | To group data points to understand their collective behavior. |
| Types | Linear, multiple linear, and logistic regression | K-means clustering |
| Input and output data | Input: features and labels. Output: predicted labels. For example, crop yield could be predicted from weather conditions and soil quality. | Input: unlabeled data. Output: clusters of similar data points. For example, farmers could be clustered based on their farming practices. |
| Advantages | Regression analysis helps in making predictions and forecasts for a business in the near and long term. It supports business decisions by providing information about the dependent target and its predictors. | Grouping farmers into clusters can help them consolidate produce and deliver it in bulk, saving on transportation costs while increasing income. Cluster sampling, a related survey technique, also requires fewer resources than simple random or stratified sampling. |
| Disadvantages | Regression models do not work properly if the input data contains errors (that is, poor-quality data). If preprocessing does not adequately handle missing values, redundant data, outliers, or an imbalanced data distribution, the validity of the regression model suffers. | Many different clustering algorithms exist, and they produce different outcomes. Likewise, if preprocessing does not adequately handle missing values, redundant data, outliers, or an imbalanced data distribution, the validity of the cluster model suffers. |
| Usage | Used for prediction and forecasting, assessing the strength of the relationship between variables, and modeling the future relationship between them. Widely used in financial analysis. | Used in machine learning, image analysis, data mining, and pattern recognition. Popular in marketing for customer segmentation. |
| Use of labels | Regression analysis uses labels. It predicts a dependent variable (label) based on independent variables (features). | Cluster analysis doesn’t use labels. It groups data based on similarities among the features. |
| Challenges | Requires a large amount of data for accurate predictions. The choice of regression model is also crucial. | Requires a good understanding of the data and the right choice of clustering algorithm. |
| Benefits | Helps in predicting yield based on various factors, which can lead to better resource allocation and increased efficiency. | Helps in studying and analyzing vegetation’s biophysical or biochemical status, which can lead to better crop management and yield. |
| Data requirements | Requires a large amount of labeled data for accurate predictions; the choice of regression model is crucial. | Requires a good understanding of the data and the right choice of clustering algorithm. It can handle unlabeled data and is often used to model and analyze data with small sample sizes. |
| Application | Regression analysis has been used to analyze the efficiency of small-scale maize farmers. It helps in understanding the impact of factors such as level of education, farming experience, access to irrigation water, purchase of hybrid seed, access to credit, and extension visits on farmers’ efficiency. | Cluster analysis is used to evaluate plant characteristics quantitatively. It can also be used to identify patterns in agricultural supply chain data. |
| Interpretability | As a supervised method that predicts a dependent variable from independent variables, regression analysis is generally more interpretable because it provides a clear relationship between input and output. | As an unsupervised method that groups similar data points together, cluster analysis is generally less interpretable than regression analysis because it doesn’t use labels. |
| Data preprocessing | As with cluster analysis, preprocessing involves cleaning, transforming, and normalizing data. | Preprocessing involves dealing with missing or erroneous data, transforming data into a usable format, normalizing data, and reducing dimensionality. |
| Model evaluation | Regression analysis has a clear measure of accuracy, typically the difference between the predicted and actual values. | Cluster analysis doesn’t have a natural measure of accuracy. Instead, the goal is to group objects into clusters based only on their observable features. |
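
To make the regression column concrete, here is a minimal sketch of simple linear regression with one feature, using only the Python standard library. The rainfall and yield figures are invented for illustration and do not come from any of the cited studies; the function names are likewise hypothetical. The error measure at the end is the "difference between the predicted and actual values" that the model evaluation row describes.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(slope, intercept, x):
    """Predicted label for one feature value."""
    return slope * x + intercept


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted labels."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical labeled data (assumption): seasonal rainfall in mm (feature)
# and maize yield in tonnes per hectare (label).
rainfall_mm = [300, 400, 500, 600, 700]
yield_t_ha = [1.1, 1.6, 1.9, 2.6, 2.8]

slope, intercept = fit_simple_linear(rainfall_mm, yield_t_ha)
fitted = [predict(slope, intercept, x) for x in rainfall_mm]
mae = mean_absolute_error(yield_t_ha, fitted)

print(f"slope={slope:.4f} t/ha per mm, intercept={intercept:.2f} t/ha, MAE={mae:.2f} t/ha")
```

With real farm data the same idea extends to several features (multiple linear regression), and the error should be measured on data held back from fitting rather than on the training rows used here.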

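The cluster column can be sketched in the same spirit. The example below is a small, unoptimized version of K-means (Lloyd's algorithm) in plain Python, preceded by the min-max normalization step that the preprocessing row mentions. The farm records, the choice of two clusters, and the starting centroids are all assumptions made for illustration; no labels are supplied, so the grouping comes only from the features.

```python
import math


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids) once assignments stop changing."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical unlabeled data (assumption): (farm size in ha, fertilizer use in kg/ha).
farms = [(1.0, 20), (1.5, 25), (2.0, 30), (8.0, 120), (9.0, 110), (10.0, 130)]

scaled = min_max_scale(farms)
# Starting centroids are an assumption: the first and last farms.
labels, centroids = k_means(scaled, [scaled[0], scaled[-1]])

for farm, label in zip(farms, labels):
    print(farm, "-> cluster", label)
```

Different starting centroids or a different number of clusters can give a different grouping, which is the sensitivity noted in the disadvantages row.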
References:

Arevalo-Ramirez, T. and Auat Cheein, F. (2023) ‘Cluster Analysis for Agriculture’, Encyclopedia of Smart Agriculture Technologies, pp. 1–8. doi:10.1007/978-3-030-89123-7_189-1.

Bertsimas, D., Orfanoudaki, A. and Wiberg, H. (2020) ‘Interpretable clustering: An optimization approach’, Machine Learning, 110(1), pp. 89–138. doi:10.1007/s10994-020-05896-2.

CFI Team (2023) Cluster sampling, Corporate Finance Institute. Available at: https://corporatefinanceinstitute.com/resources/data-science/cluster-sampling/ (Accessed: 01 March 2024).

Das, S. (2017) Decision trees vs. clustering algorithms vs. linear regression - dzone, dzone.com. Available at: https://dzone.com/articles/decision-trees-v-clustering-algorithms-v-linear-re (Accessed: 01 March 2024).

Ergando, H.M. (2023) Wheat Cluster Farming Approach: Challenges and prospects for smallholder farmers in Ethiopia. Available at: https://publications.waset.org/abstracts/166374/wheat-cluster-farming-approach-challenges-and-prospects-for-smallholder-farmers-in-ethiopia (Accessed: 01 March 2024).

Etumnu, C. and Gray, A.W. (2020) ‘A clustering approach to understanding farmers’ success strategies’, Journal of Agricultural and Applied Economics, 52(3), pp. 335–351. doi:10.1017/aae.2020.4.

Galli, S. (2023) Mastering data preprocessing: Techniques and best practices, Train in Data Blog. Available at: https://www.blog.trainindata.com/mastering-data-preprocessing-techniques/ (Accessed: 01 March 2024).

Hassan, A. (2023) What is cluster analysis?, Built In. Available at: https://builtin.com/data-science/cluster-analysis (Accessed: 01 March 2024).

Hassan, M. (2023) Cluster analysis - types, methods and examples, Research Method. Available at: https://researchmethod.net/cluster-analysis/ (Accessed: 01 March 2024).

Lesire, I. (2022) Cluster analysis: A data-informed approach to improving smallholder livelihoods, The Akvo blog. Available at: https://datajourney.akvo.org/blog/cluster-analysis-smallholder-farmers (Accessed: 29 March 2024).

Majumdar, J., Naraseeyappa, S. and Ankalaki, S. (2017) ‘Analysis of agriculture data using data mining techniques: Application of big data’, Journal of Big Data, 4(1), pp. 1–15. doi:10.1186/s40537-017-0077-4.

Mishra, S. (2017) Unsupervised learning and data clustering, Medium. Available at: https://towardsdatascience.com/unsupervised-learning-and-data-clustering-eeecb78b422a (Accessed: 01 March 2024).

Mokgalabone, M.S. (2015) ‘Analyzing the technical and allocative efficiency of smallscale maize farmers in Tzaneen Municipality of Mopani District: A Cobb-Douglas and logistic regression approach’.

Query and Search, S. (2023) Data Analysis Part 5: Data Classification, clustering, and regression, Query. Available at: https://www.query.ai/resources/blogs/data-analysis-part-5-data-classification-clustering-and-regression/ (Accessed: 01 March 2024).

Regression analysis: Types, importance and limitations (2020) CommerceMates. Available at: https://commercemates.com/regression-analysis/ (Accessed: 01 March 2024).
