Soil Classification and Crop Suggestion Using Deep Learning Techniques: Soilla
BY
PATRICK OKELLO
2018/BCS/050/PS
Bachelor of Computer Science
Email: [email protected]
Tel:+256-787250196
Supervised By
Mr. Mwavu Rogers
[email protected] +(256)-700-497-421
MSc. Information System, (MUST-Uganda)
February 8, 2022
Declaration
I, Okello Patrick, declare that this work has been produced by me in partial fulfillment
of the requirements for the award of the degree of Bachelor of Computer Science of
Mbarara University of Science and Technology. I hereby state that this is my own work
and that it has not been submitted to any other institution for another degree or qualification.
Throughout the work I have acknowledged all sources used in its compilation.
Signature Date
Approval
This final year project has been submitted for examination with the approval of the follow-
ing supervisor.
Signature Date
Dedication
I dedicate this report to my mother, Kasfa Aceng, to my entire family and relatives,
to my supervisor Mr. Mwavu Rogers, and to the entire MUST fraternity for their
contribution in the form of guidance, advice, and financial help towards the completion
of this project.
I also dedicate this report to two best friends of mine, Musiime Claire and Kemigisa Precious
Joan. I can’t thank you both enough for all you have done for me.
Acknowledgment
With heartfelt gratitude, I humbly thank the Almighty for His providence throughout the
project. I extend my special thanks to the people who generously contributed towards
this project, especially my sponsors for their financial support and the staff of
Kakebe Technologies for the support and cooperation that helped me to achieve the goal
of the project. I acknowledge Aguma Gloria Kabasinguzi for her efforts in collecting
and writing the soil knowledge bases used in this report.
I also wish to thank my father, Rev. Dickens Anyati, my friends, and my family for their
financial support, and my fellow students for their cooperation during the course of the project.
Abstract
The objective of this study was to process soil images to build a low-cost digital soil
classification and crop suggestion system for rural farmers. Soil texture is the main
factor to be considered before cultivation [2]. Knowing the soil’s texture helps to predict
how it will behave under different conditions, and it is the first step toward creating the
best conditions for the plants being grown. Soil texture affects crop selection and regulates
the water transmission property [11]. The conventional hydrometer method [16] determines
the percentage of sand, silt, and clay present in a soil sample, but it is costly and
time-consuming. In this study, 50 soil samples were collected from different locations
within a five-kilometer radius of Mbarara University. The samples were photographed under
constant lighting using Android mobile phone cameras. The fractions of sand, silt, and clay
in each sample were then determined using the hydrometer test, and the results were mapped
onto the United States Department of Agriculture soil classification triangle [13] to obtain
the final soil class. The soil images were processed in stages: pre-processing for image
enhancement, extraction of the region of interest for segmentation, and texture analysis
to build a dataset. The dataset had four classes/labels: sand, clay, and loam soil images,
plus a fourth, negative class of random images that are not soil. A TensorFlow-Keras image
classification model [14] was built on top of a Convolutional Neural Network (CNN)
pre-trained on the ImageNet dataset. Finally, the trained image classifier was converted
into a quantized TensorFlow Lite model and embedded in an Android mobile app that loads
images from the phone gallery or the camera view, classifies the soil images, and suggests
crops suitable for the identified soil type. The proposed method gave an average accuracy
of 91.37 percent across all the soil samples.
Contents
1 Introduction
1.1 Overview
1.2 Background
1.3 Problem Statement
1.4 General Objectives
1.5 Specific Objectives
1.6 Research Questions
1.7 Significance
1.7.1 Value Proposition
1.7.2 Innovation
1.7.3 Impact
1.8 Scope
1.8.1 The Boundary of Research
1.8.2 Technical scope
1.8.3 Time scope
2 Literature Review
2.1 Introduction
2.2 Existing Systems
2.2.1 SoilCare scanner: Soil Testing System
2.2.2 Soil Classification using Machine Learning Methods and Crop Suggestion Based on Soil Series
2.2.3 Big Data Analytics for Crop Prediction Model Using Optimization Technique
2.2.4 Improving Crop Productivity Through A Crop Recommendation System Using Ensembling Technique
2.3 Conclusion
3 Methodology
3.1 Introduction
3.2 The Design Science Research Methodology
3.2.1 Problem Awareness
3.2.2 Suggestion
3.2.3 Development
3.2.4 System evaluation
3.2.5 Conclusion
4 Presentation of Results
4.1 Introduction
4.2 Presentation of results of objective 1
4.3 Presentation of results of objective 2
4.3.1 Analysis
4.4 Presentation of results of objective 3
List of Figures
3.1 The general steps of the Design science research methodology
4.1 A figure showing the output of the training process of a VGG16 Model
4.2 A figure showing the output of the training process of an Inception Model
4.3 A figure showing the performance of a ResNet50 Model
4.4 A figure showing the performance of an Efficient Net Model
4.5 A figure showing the performance comparison of all 4 models
5.1 Left: The original VGG16 network architecture. Middle: Removing the FC
layers from VGG16 and treating the final POOL layer as a feature extractor.
Right: Removing the original FC Layers and replacing them with a brand
new FC head. These FC layers can then be fine-tuned to a specific dataset
(the old FC Layers are no longer used).
5.2 Left: When we start the fine-tuning process, we freeze all CONV layers in
the network and only allow the gradient to back-propagate through the FC
layers. Doing this allows our network to “warm up”. Right: After the FC
layers have had a chance to warm up, we may choose to unfreeze all or some
of the layers earlier in the network and allow each of them to be fine-tuned
as well.
5.3 Our Keras fine-tuning network is allowed to “warm up” prior to unfreezing
only the final block of CONV layers in VGG16.
5.4 We have unfrozen the final CONV block and resumed fine-tuning with Keras
and deep learning. Training and validation loss are starting to diverge,
indicating the start of over-fitting, so fine-tuning stops at epoch 20.
5.5 Visual view of the model in Netron
5.6 A figure showing the properties of the trained model
5.7 Mobile Home Screen View
5.8 Mobile analytics Screen
5.9 Mobile Classification Screen
5.10 Mobile Graph Screen
5.11 Mobile Crop suggestion View
5.12 Mobile guide screen
Chapter 1: Introduction
1.1 Overview
Knowledge of soil classification helps predict soil behavior, which in turn helps predict
how well crops will perform in that soil. Technological advances have spawned smart
agriculture, in which farmers adopt technologies such as irrigation, soil sampling, and
profiling systems to increase crop production and income. Ugandan farmers, however,
continue to face challenges in implementing such systems [9]. Because they are unable to
classify their soil types or identify the right crops to plant on a given soil, they face
a high risk of losing their investment, poor crop yields, and low income.
The main objective of this project was to improve soil classification and crop suggestion
in low-resource settings of Uganda. The first specific objective was to examine the
existing applications and systems used to classify soil samples and suggest crops, in order
to identify the parameters required to design an image classifier model and to improve on
existing methods. Its results led to the second objective, which was to design and develop
a model for soil texture classification and crop suggestion. A model was built on top of a
CNN (the VGG16 network [20]) pre-trained on the ImageNet dataset [8]. This in turn led to
the third objective, which was to develop an Android mobile application based on the
designed model. The last objective was to test and validate the designed application, after
which the results were produced and analyzed.
The research was guided by the design science methodology [3] to achieve the research
objectives and answer the research questions. A literature survey was used to obtain the
requirements of the model and to clearly understand the problem being solved. The
requirements obtained in objective one were used in the model design, where four
classes/labels (clay, loam, sand, and unknown soil type) were used to classify a given
image. TensorFlow, Keras, Python, and associated libraries were used to design the network
and train the model. A custom Python script was used to convert the trained h5 Keras
model into a quantized TensorFlow Lite model to be embedded in the Android application.
Figma, an online UI/UX design tool, was used to design the Android application interfaces
and to simulate the interactions within the mobile application. Later, the mobile
application, called Soilla, was developed to enable farmers to carry out soil
classification and crop suggestion with ease and offline. The application was based on the
model designed in the second objective. Android Studio, which supports the Java and XML
languages, was used to develop the mobile application. The designed model and mobile
application may be used by farmers and agricultural workers to classify soils and get
suggestions of crops to plant in a given soil sample in different regions of
southwestern Uganda.
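To illustrate what converting a model to a quantized form involves, the following is a minimal pure-Python sketch of 8-bit affine quantization, in which each float is stored as an integer together with a shared scale and zero point. The report does not state which quantization scheme the conversion script used, so the function names and the example weights below are assumptions made for illustration only.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers with an affine (scale, zero_point) scheme."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The represented range must contain 0 so that 0.0 maps to an exact integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - lo / scale))
    q = [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats: real = scale * (q - zero_point)."""
    return [scale * (x - zero_point) for x in q]


weights = [-0.62, -0.10, 0.0, 0.31, 0.88]  # made-up example weights
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step in this example. This is the trade-off that makes a quantized model attractive for an offline mobile application.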
The rest of this report is organized as follows. Chapter one covers the background,
problem statement, objectives, research questions, scope, and significance. Chapter two
contains literature organized in themes according to the specific objectives. Chapter three
covers the methods used to achieve the specific objectives. Chapter four discusses the
results and findings for the research questions. Chapter five discusses the development and
implementation of the mobile application. Lastly, chapter six presents the conclusions and
recommendations for the entire project.
1.2 Background
The world has not been generally progressing either towards ensuring access to safe, nutri-
tious, and sufficient food for all people all year round (SDG Target 2.1), or to eradicating all
forms of malnutrition (SDG Target 2.2). [7] Conflict, climate variability and extremes, and
economic slowdowns and downturns are the major drivers slowing down progress, particu-
larly where inequality is high. The COVID-19 pandemic made the pathway towards SDG2
even steeper. In 2020, between 720 and 811 million people faced hunger. [7]
The number of people in the world affected by hunger increased in 2020 under the shadow
of the COVID-19 pandemic. After remaining virtually unchanged from 2014 to 2019, the
prevalence of undernourishment (PoU) climbed to around 9.9 percent in 2020, from 8.4
percent a year earlier [7]. In terms of population, taking into consideration the additional
statistical uncertainty, it is estimated that between 720 and 811 million people in the world
faced hunger in 2020. Considering the middle of the projected range (768 million), 118
million more people were facing hunger in 2020 than in 2019 – or as many as 161 million,
considering the upper bound of the range. [7]
Uganda has a fast-growing population and suffers from widespread poverty [9]. Its pop-
ulation, at 43 million people in 2017, is growing at 3.3 percent per year and is expected to
exceed 100 million by 2050 (UN DESA Population Division 2017). Uganda’s poverty rate
in 2016 was 41.6 percent, down from more than 60 percent in the 1990s, but with some
fluctuation in recent years. Its GDP per capita in 2017 was just $606 in current US dollars,
much less than half of the average for Africa south of the Sahara, which stands at $1,574
(World Bank 2019). Agriculture plays a vital role in the Ugandan economy [5]. Sixty-eight
percent of Ugandans are employed in agriculture, 7 percent in industry, and 25 percent in
services. Meanwhile, 25 percent of GDP comes from agriculture, 20 percent from industry,
and 47 percent from services. Uganda has high levels of biodiversity, rich volcanic soils,
multiple freshwater lakes with irrigation potential, and two rainy seasons per year, which
are beneficial to agricultural production. Yet agriculture in Uganda has been plagued by
droughts and damaging diseases and pests [9]. To raise productivity and improve food se-
curity, Uganda will need to boost extension services and farmers’ use of inputs and reduce
their post-harvest losses.
Unless bold actions are taken to accelerate progress, especially actions to address major
drivers of food insecurity and malnutrition and the inequalities affecting the access of
millions to food, hunger will not be eradicated by 2030. According to the Food and
Agriculture Organization of the United Nations (FAO) [7], there are six pathways towards
food systems transformation, depending on context, and one of them is intervening along
the food supply chains [7] to lower the cost of nutritious foods. The high cost of food can
be traced back and determined by how the food crops are grown and how farmers can have
high crop yields to increase income earnings. Poor crop yields lead to an increase in cost
due to excessive demands for limited food crops [12]. Therefore there is a need to focus on
how crops are grown and ways to improve yields to produce enough food to cater to the
increasingly growing population.
Soil research is directed towards helping nature and the farmer to grow food in sufficient
quantities to meet the world’s needs [7]. The ever-increasing intensification of agriculture
means that several different disciplines and specialized skills are now required to elucidate
the complex of soil-plant-animal relationships [9]. Several computer science approaches have
been applied to classify soil types using properties such as soil texture [10], color,
and area. Most of these approaches have not been implemented in a mobile or web application
that farmers can use to identify their soil types and get suggestions about the crops
to plant.
Given the rapid increase in population in Uganda, with agriculture being the backbone
of the economy, there is a need for advancement in agricultural practices. Farmers need
better and more efficient tools to analyze and classify soil without the need for an
expert. This will lead to improved crop yields and food production to cater to the growing
population. This motivated the researcher to develop a mobile application that uses deep
learning techniques to classify soil samples and suggest food crops to be planted on any
given soil. Farmers can use this application offline and do not need an internet connection
to know their soil types or the crops to plant in them. The mobile application can also be
used by agricultural officials to help build a national map of soil profiles across the
country to act as a point of reference.
enough food crops to feed the increasing population
1.5 Specific Objectives
1. To examine the current methods used for soil classification and crop suggestion.
2. To design and develop an image classification model for soil texture classification and
crop suggestion, and to develop a mobile application based on the developed model.
3. To test and validate the developed model for soil classification and crop suggestion.
1.6 Research Questions
1. What are the current tools used to classify soil and get crop suggestions for a
particular soil sample?
2. What should be considered in designing and developing a deep learning soil image
classification model, and how can such a model be embedded in a mobile application for
it to function offline?
3. What is the performance of the application when compared against actual soil samples
and the crops tested on those samples?
1.7 Significance
The project aimed to reduce the rate of crop failures, increase the efficient utilization
of soil, and allow proper land management. This is achieved through correct classification
of soil types and suggestion of crops for a given soil type using deep learning techniques.
1.7.1 Value Proposition
The soil classification and crop suggestion application was aimed at providing a convenient,
high-performance alternative way in which soil samples can be easily identified and
suitable crops suggested for them. It was also readily accessible, since the designed model
was deployed in a mobile application that anyone who owned a smartphone could download from
the Google Play Store. According to [22], Internet users in the emerging world were more
frequent users of social networks than those in the U.S. and Europe, indicating an emerging
trend in internet usage.
1.7.2 Innovation
Using the soil type classification and crop suggestion system, farmers’ crop yields will
improve, leading to increased income and enough food supply to feed the
ever-growing population of Uganda.
1.7.3 Impact
The mobile application will help farmers to make the best decisions on the right crops to
plant on a particular soil type. It will also help agricultural researchers to classify
soil types easily and efficiently, and hence improve their efficiency in doing research.
The project was aimed at benefiting several groups of people, such as:
1. Farmers. To classify soil samples and know the right crops to plant on a given soil.
2. Researchers. To improve the process of soil classification and crop suggestion and
help build the national soil map of Uganda.
1.8 Scope
1.8.1 The Boundary of Research
The research focused on classifying three major soil types, i.e. sand, loam, and clay
soil. Geographically, the project was limited to southwestern Uganda, specifically areas
within a five-kilometer radius of Mbarara University.
1.8.2 Technical scope
The design was done using Visual Paradigm for the UML diagrams and pencil for the
wireframes and flowcharts; Figma [6] was used for the UI/UX design of the mobile application.
The mobile application was developed using Java and XML, with Android Studio as the IDE.
The deep learning image classification model was developed using TensorFlow, Keras,
OpenCV, Python, and associated libraries such as pandas and NumPy. The base model is a
VGG16 CNN pre-trained on the ImageNet dataset.
1.8.3 Time scope
The whole project ran for a period of 8 months, from February to December 2021, which
covered proposal approval, data collection, proposal submission, project development,
and lastly implementation.
Chapter 2: Literature Review
2.1 Introduction
In this section, the researcher reviewed several systems, applications, and projects
used for soil classification and crop suggestion in different countries and in Uganda to
improve crop yields. The review focused on how the different systems were designed and
developed and on the services they offer once implemented, in order to identify their
strengths and weaknesses, as described below.
2.2 Existing Systems
2.2.1 SoilCare scanner: Soil Testing System
The SoilCare [18] scanner is a portable device that uses near-infrared technology and a
connection to the SoilCare global soil database to accurately determine the soil’s
properties. It measures the amount of Nitrogen (N), Phosphorus (P), and Potassium (K) and
the electrical conductivity of the soil, and determines the temperature, pH, and organic
matter level. It was an initiative by Agri Pro Focus, a network of multiple stakeholders
dedicated to supporting farmers in achieving their objectives in commercial farming.
Benefits
1. It is an easy-to-use hand-held device that conducts fast on-the-spot soil quality checks,
on which an expert can base local crop-specific fertilizer recommendations.
2. It provides a soil report consisting of the soil status, suitable crop types to be grown
on a specific soil type, and fertilizer recommendations.
Challenges
1. Farmers are encouraged to seek the service from a service provider at a fee of Shs50,000
per sample scanned. Farmers are advised to hire the service because the scanner is
costly: the hardware costs about 3,000 euros (about Shs13m).
2. Scanning a soil sample takes 10 minutes, which could affect efficiency when sampling
has to be done over a wide piece of land.
SoilCare can only be operated by a professional and was very costly for farmers [4]. It
was therefore not very applicable to rural farmers who cannot afford the high cost of soil
sample testing. They require a simple tool that they can use themselves and still achieve
the same results of testing their soil samples and knowing the right crops to plant.
2.2.2 Soil Classification using Machine Learning Methods and Crop Suggestion Based on Soil Series
This project [15] creates a model that predicts the soil series together with the land type
and, according to the prediction, suggests suitable crops. It makes use of machine learning
algorithms such as weighted K-Nearest Neighbor (KNN), Bagged Trees, and Gaussian
kernel-based Support Vector Machines (SVM) to classify the soil series. The soil
classification philosophies are used to follow the existing knowledge and practical
circumstances. On the land surfaces of the earth, the classification of soil creates a
link between soil samples and various kinds of natural entities. Based on these
classifications and the mapped data, suitable crops were suggested for a particular region.
System overview: The system uses the soil series data obtained by the Soil Resources
Development Institute (SRDI) of Bangladesh. A soil series is a group of soils formed from
the same kind of parent materials that remain under similar conditions of drainage,
vegetation, time, and climate. Soils in a series also have the same patterns of soil
horizons with differentiating properties. Each soil series was named based on its locality.
The main purpose of the system was to create a suitable model for classifying various kinds
of soil series data, along with suitable crop suggestions for certain regions of
Bangladesh. The authors worked on 9 soil series datasets obtained from six Upazilas of
Khulna district, Bangladesh. The dataset had 383 samples with 11 classes. The crop database
was created by considering the Upazila codes, map units, and class labels. The method
involved two phases: the training phase and the testing phase. Two datasets were used: the
soil dataset and the crop dataset. The soil dataset contains class-labeled chemical
features of the soil. Machine learning methods were used to find the soil class. The three
methods used were weighted K-NN, Gaussian kernel-based SVM, and Bagged Trees.
Weighted KNN: KNN uses all the training data since weighting was used. The nearest
neighbors are given more weight than the farther ones. The distances are calculated, and
the class with the largest weighted vote was taken as the final classification. The
accuracy obtained using KNN was 92.93%.
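The weighting idea can be sketched in a few lines of pure Python. This is an illustrative sketch, not the reviewed authors' implementation: the inverse-distance weight, the function name `weighted_knn_predict`, and the toy data points are assumptions.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k):
    """train: list of (feature_vector, label); returns the label with the largest weighted vote."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        # Closer neighbours contribute larger votes (inverse-distance weighting).
        votes[label] += 1.0 / (math.dist(features, query) + 1e-9)
    return max(votes, key=votes.get)


# Made-up two-feature samples: two of class "A", three of class "B".
train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"),
         ((3.0, 3.1), "B"), ((3.2, 2.9), "B"), ((2.9, 3.0), "B")]
prediction = weighted_knn_predict(train, (1.1, 1.0), k=len(train))
```

With `k` set to the full training set, an unweighted majority vote would return "B" (three samples against two), whereas the weighted vote returns "A" because the two "A" samples lie much closer to the query.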
Gaussian Kernel-based SVM: SVM separates the objects of different classes using decision
planes. A decision boundary separates the objects of one class from the objects of
another class. Support vectors are the data points that are nearest to the hyper-plane. A
kernel function separates non-linear data by transforming the inputs to a
higher-dimensional space. In this project, the Gaussian kernel function was used. The
accuracy obtained using SVM was 94.95%.
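For reference, the Gaussian (RBF) kernel has the standard form K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it on made-up points; the value of `gamma` is an assumption, since the reviewed work's setting is not given here.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2); equals 1 for identical points and decays with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = gaussian_kernel((1.0, 2.0), (1.0, 2.0))
near = gaussian_kernel((1.0, 2.0), (1.5, 2.0))
far = gaussian_kernel((1.0, 2.0), (4.0, 6.0))
```

The kernel value acts as a similarity score, so nearby samples influence the decision boundary far more than distant ones.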
Bagged Trees: Bagging generates a set of models, each trained on a random sample of the
data, that predict the classes. When given a new instance, the predictions of these models
are aggregated to find the final predicted class. The accuracy obtained using this
algorithm was 90.91%.
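A minimal sketch of bagging is given below. To keep it short, each base model is a depth-1 decision tree (a stump) on a single feature rather than a full tree, and the data are made up; both are assumptions for illustration, not the reviewed authors' setup.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a depth-1 tree on (x, label) pairs: choose the threshold with the fewest training errors."""
    xs = sorted({x for x, _ in sample})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = Counter(lbl for x, lbl in sample if x <= t).most_common(1)[0][0]
        right = Counter(lbl for x, lbl in sample if x > t).most_common(1)[0][0]
        errors = sum((left if x <= t else right) != lbl for x, lbl in sample)
        if best is None or errors < best[0]:
            best = (errors, t, left, right)
    if best is None:  # the bootstrap sample contains a single distinct x value
        majority = Counter(lbl for _, lbl in sample).most_common(1)[0][0]
        return lambda x: majority
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagged_predict(data, x, n_models=25, seed=0):
    """Train each stump on a bootstrap sample and aggregate the predictions by majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]  # sampling with replacement
        votes[fit_stump(bootstrap)(x)] += 1
    return votes.most_common(1)[0][0]


data = [(0.1, "A"), (0.2, "A"), (0.3, "A"), (0.7, "B"), (0.8, "B"), (0.9, "B")]
low_prediction = bagged_predict(data, 0.15)
high_prediction = bagged_predict(data, 0.85)
```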
After classifying the soil series, the crops that are suitable for that series in the given
map unit of the corresponding Upazila were suggested. The results showed that K-NN and
Bagged Trees had comparable accuracy, but SVM outperformed the other two algorithms
with an accuracy of 94.95%. The system achieved an average accuracy
of 92.93%.
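The reported average can be checked as the arithmetic mean of the three accuracies quoted above. The short sketch below is only a sanity check of that arithmetic; the dictionary keys are labels chosen here.

```python
accuracies = {"weighted k-NN": 92.93, "Gaussian SVM": 94.95, "bagged trees": 90.91}

# Mean of the three reported accuracies, rounded to two decimal places.
mean_accuracy = round(sum(accuracies.values()) / len(accuracies), 2)
best_method = max(accuracies, key=accuracies.get)
```

The mean works out to 92.93%, which coincides with the accuracy of weighted k-NN alone.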
2.2.3 Big Data Analytics for Crop Prediction Model Using Optimization Technique
Big data analysis is used to discover novel solutions by analyzing bulky data sets, which
supports decision making in the agricultural field [17]. In this project, two classes, good
yield and bad yield, were predicted based on soil and environmental features such as
average temperature, average humidity, total rainfall, and production yield. A hybrid
classifier model was used to optimize the features. The approach, called SVM GWO, combined
a Grey Wolf Optimizer (GWO) with Support Vector Machine (SVM) classification to improve
accuracy, precision, recall, and F-measure. The results showed that the SVM GWO approach
performed better than the typical SVM classification algorithm.
The proposed methodology applied Grey Wolf Optimization (GWO) over an SVM classifier and
was divided into three phases: pre-processing, feature selection, and SVM GWO
classification.
Data Pre-Processing: Data pre-processing is deemed to be the main step in data mining
and machine learning projects. Ineffective data collection can lead to improper
combinations, weak control, and incorrect values. It also produces misleading results if
the data are not identified clearly. It was therefore essential to ensure the quality and
representativeness of the data at the initial stage.
Feature Selection: A feature selection algorithm is used to select a subset of the
features that are considered to be important, so that redundant data can be removed. It
also eliminates the irrelevant and noisy features of the data to make it simpler and more
accurate, and it later helps to save memory and storage. Feature selection algorithms fall
into two categories: subset selection methods and feature ranking methods. Subset selection
methods search for the best subset of the required features.
Grey Wolf Optimization Based Feature Selection: GWO selects the optimal feature subset
for classification. In this approach, optimization was used to give each feature a relative
weight according to its relative information while reducing the training error. Grey wolf
optimization thus captures the effective relations between features and extracts useful
information from them. This information reduces the number of support vectors in the SVM,
which reduces the training error of the classifier. It therefore has a direct impact on the
testing error of classification and increases accuracy, precision, and recall.
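The core position update of grey wolf optimization can be sketched as follows. This is a generic sketch on a toy objective (the sphere function), not the reviewed paper's feature-weighting procedure: in that work the objective would be a classification error, and the function names, bounds, and pack size here are assumptions.

```python
import random


def grey_wolf_optimize(objective, dim, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimise `objective` inside a box using the basic grey wolf position update."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best = min(wolves, key=objective)[:]
    for it in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        leaders = [w[:] for w in sorted(wolves, key=objective)[:3]]
        if objective(leaders[0]) < objective(best):
            best = leaders[0][:]
        a = 2.0 * (1 - it / n_iters)  # decreases linearly from 2 towards 0
        for wolf in wolves:
            for d in range(dim):
                pulled = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    pulled += leader[d] - A * abs(C * leader[d] - wolf[d])
                # New coordinate: average of the three leader-guided moves, kept in bounds.
                wolf[d] = min(hi, max(lo, pulled / 3))
    final = min(wolves, key=objective)
    return final if objective(final) < objective(best) else best


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


solution = grey_wolf_optimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

Early iterations (large `a`) let wolves overshoot the leaders and explore; later iterations (small `a`) pull the pack tightly around the best positions found.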
Machine Learning Classifier: A feature selection algorithm selects a subset of vital
features and removes superfluous, unrelated, and noisy features for a simpler and more
precise data representation.
Proposed Algorithm: SVM GWO. The objective of the study was to increase the accuracy of
the prediction model by using different parameters for future precision agriculture. The
dataset was taken from the Food and Agriculture Organization and fed to the processing
model through MapReduce. To process the big data, MapReduce was used to minimize the
execution time, after which classification was done. Feature selection and extraction are
important steps in classification systems.
The paper presents a hybrid model, SVM GWO, that uses a combinational approach to improve
classification accuracy, recall, precision, and F-measure by selecting the optimal
parameter settings in the SVM. The feature vector was extracted with minimum error and
convergence, and SVM GWO was then developed to select the optimal SVM parameters. The
results show that the proposed approach was better than the typical SVM classification
algorithm, with a classification accuracy of 77.09%, precision of 75.38%,
recall of 74.189%, and F-measure of 73.15%.
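For readers unfamiliar with these metrics, the sketch below shows how accuracy, precision, recall, and F-measure are computed for a two-class problem from confusion-matrix counts. The counts are made up for illustration and are not taken from the reviewed paper, whose averaging method is not stated here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f_measure


# Hypothetical counts for a "good yield" versus "bad yield" classifier.
accuracy, precision, recall, f_measure = classification_metrics(tp=45, fp=15, fn=5, tn=35)
```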
The ensembling technique was used to build a model that combines the predictions of
multiple machine learning models to recommend the right crop, with high accuracy, based
on the specific soil type and its characteristics [10]. The independent base learners used
in the ensemble model are Random Forest, Naive Bayes, and Linear SVM. Each classifier
provides its own set of class labels with acceptable accuracy. The class labels of the
individual base learners are combined using the majority voting technique. The crop
recommendation system classifies the input soil dataset into the recommendable crop types,
Kharif and Rabi. The dataset comprises soil-specific physical and chemical characteristics
in addition to climatic conditions such as average rainfall and surface temperature. The
average classification accuracy obtained by combining the independent base learners was
99.91%.
A brief step-by-step procedure for designing the crop recommendation system is as
follows:
Step 1: Input. The input dataset was a comma-separated values file containing the soil
dataset, which has to be subjected to pre-processing.
Step 2: Pre-processing of input data. The input dataset was subjected to various
pre-processing techniques such as filling in missing values, encoding categorical data and
scaling values to the appropriate range.
Step 3: Splitting into training and testing datasets. The pre-processed dataset was then
split into training and testing datasets based on the specified split ratio. The split ratio
considered in the proposed work was 75:25, which means 75% of the dataset was used for
training the ensemble model and the remaining 25% was used as a test dataset.
Step 4: Building individual classifiers. The training dataset was fed to each of the
independent base learners, and the individual classifiers were built using the training
dataset.
Step 5: Testing the data on each of the classifiers. The testing dataset was applied to
each of the classifiers, and the individual class labels were obtained.
Step 6: Ensembling the individual classifier outputs using the majority voting technique.
Proposed Algorithm: Ensemble Framework, Random Forest, Naive Bayes, Linear SVM,
Majority Voting.
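The splitting and voting steps above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the code of the reviewed work: the
function names are hypothetical, the three prediction lists are made-up stand-ins for the
outputs of Random Forest, Naive Bayes and Linear SVM, and a real split would normally
shuffle the rows first.

```python
from collections import Counter


def split_dataset(rows, train_fraction=0.75):
    """Split rows into training and testing parts (75:25 by default, no shuffling)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def majority_vote(labels):
    """Return the label chosen by most base learners.

    On a tie, the label that appears first in `labels` wins.
    """
    return Counter(labels).most_common(1)[0][0]


# Hypothetical class labels from three base learners for four test samples.
random_forest = ["Rice", "Wheat", "Cotton", "Sugarcane"]
naive_bayes = ["Rice", "Cotton", "Cotton", "Wheat"]
linear_svm = ["Wheat", "Wheat", "Cotton", "Sugarcane"]

ensemble = [majority_vote(votes) for votes in zip(random_forest, naive_bayes, linear_svm)]
print(ensemble)

train, test = split_dataset(list(range(8)))
print(len(train), len(test))
```

With three base learners and a single label per learner, at least two learners must agree
for a label to win outright; the tie rule only matters when all three disagree.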
The dataset used in the proposed work was a soil dataset primarily comprising soil
physical and chemical properties, along with climatic details. It is an open-source dataset
obtained from the data repository site of the Government of India, data.gov.in.
The dataset was 5 MB in size, containing 9000 rows and 15 attributes. The crops
considered are Cotton, Sugarcane, Rice and Wheat. The dataset attributes that are of
prime importance are soil type, pH value of the soil, NPK content of the soil, porosity of
the soil, average rainfall, surface temperature and sowing season.
A crop recommendation system was designed that takes into consideration the soil
dataset for the four crops Rice, Cotton, Sugarcane and Wheat. The soil dataset was first
pre-processed, and the ensembling technique then performed the classification of the four
crops. The individual base learners used in the ensemble model are Random Forest, Naive
Bayes, and Linear SVM. The majority voting technique was used as the combination method
to provide the best accuracy.
The accuracy obtained using the ensembling technique was 99.91%. Hence, the proposed
work helps the farmer to select the crop for cultivation accurately. According to the
authors, this increases crop productivity, which in turn boosts the economy of the country.
2.3 Conclusion
Despite the many systems and applications that have been developed to provide easy
access to soil classification and crop suggestion, as discussed above, most of them do
not fully provide the recommended services to farmers, researchers and agricultural
extension workers at all levels. For example, a number of them focus only on services
such as soil management tips, irrigation guidelines and pest control.
In addition, some of them involve technologies such as UV lights and manual processes
such as the USDA triangle method that are not commonly known by most rural farmers in
developing countries, thus denying them access to information about their soil and
recommendations of crops to plant.
This motivated the researcher to come up with a fully packaged digital platform for soil
classification and crop suggestion: Soilla. This mobile application provides four major
features: soil classification, crop suggestion, on-demand reports of past analyses and
classifications, and detailed tips on proper soil management. All these services are
available in a mobile application that can be used offline, without an internet connection,
simply and directly without the need for an expert.
Chapter 3: Methodology
3.1 Introduction
This section explains the methodology used by the researcher during the development of
the intended system in order to achieve the main goal of the project and to answer the
research questions identified in sections 1.4 and 1.6 of this report respectively. It
describes in detail all the activities carried out at each stage. The researcher employed
the Design science research methodology as explained below.
Figure 3.1: The general steps of the Design science research methodology
3.2.1 Problem Awareness
The researcher used this step to answer objective one of this study, identified in
section 1.5 of this report, which is “To examine the current methods used for soil
classification and crop suggestion”, since a better solution could not be devised without
knowing the problem and its magnitude. Here the researcher carried out a metadata search
and reviewed the views of different scholars concerning the problem.
Furthermore, the researcher reviewed literature about the different applications and
systems that were developed to address the same problem. Through this, the researcher was
able to identify their weaknesses and areas of improvement. Having realized the magnitude
of the problem, the researcher was then able to propose a better solution that would
address the weaknesses identified, as explained in the next step.
3.2.2 Suggestion
In this step, the researcher used the knowledge and skills acquired to either propose a
new solution for the problem identified or improve on the existing ones by eliminating the
weaknesses identified. This enabled the researcher to address objective two of this study
as identified in section 1.5 of this report. This step also laid a foundation for the next
phase, since a better solution for the problem at hand was devised in line with objective
two of this study, which is “To design and develop an Image classification Model for soil
texture classification and crop suggestion and develop a mobile application based on the
developed model.”
3.2.3 Development
In this step of the design science methodology, the researcher carried out the actual work
of designing and developing the suggested system identified in the previous phase using
different requirements. This enabled the researcher to fully achieve objective two of the
study as identified in section 1.5 of this report. The outcomes of this step are explained in
detail below.
The system development stage
In this phase, an image classifier model was built and added on top of a base model, a
CNN model (VGG16) pre-trained on the ImageNet dataset. The whole model was trained in
phases to enable the new head model to learn at the same rate as the base model. An
Android mobile application was developed using object-oriented programming in Java, with
the layouts designed in XML and Android Studio as the IDE. A visual simulation of the
whole mobile application was designed and implemented in Figma.
3.2.4 Evaluation
This step consists of validating the developed application within its targeted environment
with the aim of finding out whether it fulfills its intended purpose. This is part of the
researcher's future work, since it requires considerable time to accomplish fully.
3.2.5 Conclusion
This is the last step in the design science research methodology. Within this step, the
researcher, with the help of a few key stakeholders, carried out system testing, thus
answering objective three and research question three of this study as identified in
sections 1.5 and 1.6 respectively. This involved testing the image classification and crop
suggestion model and the mobile application using different tools, both visually and using
code tests. An Android smartphone was used to test the mobile application. After
successful testing, it was confirmed that the developed image classifier and mobile
application meet their overall goal of improving soil classification and crop suggestion in
low-resource settings of Uganda.
Chapter 4: Presentation of Results
4.1 Introduction
This chapter focuses on interpreting the results obtained from objectives one and two of
the project. Objective one was about examining existing models used in various soil
classification systems. This chapter examines models such as VGG16, EfficientNet,
Inception and ResNet50. All these models were pre-trained on the ImageNet dataset and
their performance was tested with the same number of training epochs.
Figure 4.1: Output of the training process of the VGG16 model
From Figure 4.1, the model achieved a validation accuracy of 93% in just 10 epochs and
without any major changes to the model. However, VGG16 took longer to train than the
other models, which would be a disadvantage when dealing with large datasets.
Inception: The Inception network originated from the paper [19], which proposed an
architecture codenamed “Inception” that was responsible for setting the new state of the
art for classification and detection in the ImageNet Large-Scale Visual Recognition
Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved
utilization of the computing resources inside the network. This was achieved by a
carefully crafted design that allows for increasing the depth and width of the network
while keeping the computational budget constant. To optimize quality, the architectural
decisions were based on the Hebbian principle and the intuition of multi-scale processing.
Figure 4.2: Output of the training process of the Inception model
From Figure 4.2, we can see that the model reached 96% validation accuracy in 10 epochs.
This model is also much faster than VGG16: each epoch takes only around a quarter of the
time taken by an epoch in VGG16.
ResNet50: Just like InceptionV3, ResNet50 is not the first model from the ResNet
family. The original model was called the Residual Network, or ResNet, and was another
milestone in the computer vision domain in 2015. The main motivation behind this model
was to avoid the loss of accuracy that occurs as a model becomes deeper. The ResNet model
also aimed to tackle the vanishing gradient problem that arises when training deep
networks with gradient descent. The earliest variant was ResNet34; ResNet50 follows a
similar technique with more layers.
Figure 4.3: Performance of the ResNet50 model
Figure 4.3 shows the results of training the model for 10 epochs.
As seen from Figure 4.4, EfficientNet reached 98% accuracy on our validation set in
only 10 epochs, making it the best-performing model.
Figure 4.5: Performance comparison of all four models
Conclusion: Based on the performance and the size of the dataset the project required,
the researcher decided to implement the base model of the soil classification and crop
suggestion tool using VGG16 because of its reliability and the available documentation
that explains its implementation for a small dataset.
4.3.1 Analysis
Development of the model took the majority of the time since it involved a variety of
tasks such as:
1. Image generation and acquisition. This involved using Google Photos for images of
soil samples.
2. Image cleaning and sorting. The acquired images were sorted to get rid of duplicates
and to remove poor quality ones, ensuring that only good quality images remain.
3. Testing and reviewing various deep learning networks, such as VGG16 and
EfficientNet, to determine the best network to use for the base model.
4. Restructuring the VGG16 network to accommodate the newly added layers on top of
its end point. The new head was trained separately before training the whole network.
1. Testing on various prototypes of the mobile application before the actual model was
implemented.
2. Visual inspection of the deep learning network in Netron to provide information about
each layer and understand the whole network.
3. Several soil samples that were not part of the training images were tested on the app
to determine the accuracy and speed of the application in determining the soil type.
The results obtained were correct, including the suggestions made by the platform.
Chapter 5: System Development and
Implementation
5.1 Requirements Identified
A mobile application based on the designed image classification model was developed in
this stage. It was centered on user requirements from farmers, who interact with the
mobile application to classify soil samples and find out the right crops to plant on a
given soil sample.
Farmers preferred a mobile application that is simple and easy to operate, with which one
can classify soil samples and also access information about the suitable crops for an
identified soil type. The farmers also required the mobile application to function offline
in order to support its use in remote places where internet connection is limited.
The Soil Classification and Crop Suggestion mobile app had the following requirements.
Functional requirements define the core functionalities of the system. The functional
requirements for the system include:
1. It will be able to classify soil samples after accepting an image input as a parameter,
either through the camera view or from the phone's photo gallery.
2. It will be able to generate reports and graphs to visualize the likelihood of occurrence
of each soil type in a given sample image. The reports will also include links to other
external information and a warning for the image to be retaken or uploaded again in
case of errors.
3. It will also enable farmers to know the right crops to plant on a given soil sample
through the suggestion feature, which is made accessible once a soil type has been
identified in a given input image.
4. It will also enable farmers to classify soil and get crop suggestions without the need
for an internet connection or cell signal.
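The second requirement above, reporting how likely each soil type is and warning the user
when the image should be retaken, can be illustrated with a short sketch. It assumes that
the classifier produces one raw score per class and that the scores are turned into
probabilities with a softmax; the function names, the example scores and the 0.5
confidence threshold are hypothetical and are not taken from the Soilla implementation.

```python
import math

# Class names follow the four classes described for the training dataset.
SOIL_CLASSES = ["clay", "sand", "loam", "unknown"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_report(scores, min_confidence=0.5):
    """Rank the soil classes by probability and flag images that should be retaken."""
    probs = softmax(scores)
    ranking = sorted(zip(SOIL_CLASSES, probs), key=lambda item: item[1], reverse=True)
    top_label, top_prob = ranking[0]
    # Assumed rule: warn when no soil is detected or the best guess is weak.
    retake = top_label == "unknown" or top_prob < min_confidence
    return {"ranking": ranking, "top": top_label, "retake": retake}


report = build_report([2.0, 0.5, 0.1, -1.0])  # made-up scores for one image
for label, prob in report["ranking"]:
    print(f"{label}: {prob:.1%}")
print("retake image:", report["retake"])
```

The ranked list is what a graph screen would plot, while the `retake` flag corresponds to
the warning described in the requirement.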
Non-functional requirements, also known as system qualities, define how the system is
supposed to provide quality of service to the end user. The non-functional requirements
for the system are evaluated in terms of security, reliability, fault tolerance, backup
and restore, as listed below:
1. Security: The model was integrated into an Android mobile application that requires
the user to authorize access to gallery photos, camera actions such as taking photos, and
phone storage.
2. Fault tolerance: The application has a “Reload Results” provision, where already
performed classifications are re-run to ensure consistency in results.
The Soil Classification and Crop Suggestion application, Soilla, is an offline mobile
application with an embedded quantized TensorFlow Lite model. The whole system consists
of two modules: the image classification module and the crop database module. These
modules work together as a soil classification and crop suggestion model. The image
classification module identifies the type of soil in a given image; if the image is not of
soil, it returns an unknown status and asks the user to retake the picture. The crop
suggestion module keeps the list of crops and the references to the soil types they grow
best in. The crop suggestion module is only executed once the image classification module
has detected soil in an image.
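The interaction between the two modules described above can be sketched as a simple
lookup that only runs when soil has been detected. This is an illustrative sketch: the
function name, the dictionary layout and the crop entries are placeholders and do not
reproduce the actual Soilla crop database.

```python
# Placeholder lookup table. The crop names are illustrative only and are NOT
# the contents of the Soilla crop database.
CROP_DATABASE = {
    "clay": ["example crop A", "example crop B"],
    "sand": ["example crop C"],
    "loam": ["example crop D", "example crop E"],
}


def suggest_crops(predicted_label):
    """Return crop suggestions for a predicted soil type.

    The crop lookup only runs when the classifier reported a known soil type;
    any other label produces an "unknown" status and a request to retake the picture.
    """
    if predicted_label not in CROP_DATABASE:
        return {
            "status": "unknown",
            "soil_type": None,
            "crops": [],
            "message": "No soil detected. Please retake the picture.",
        }
    return {
        "status": "ok",
        "soil_type": predicted_label,
        "crops": CROP_DATABASE[predicted_label],
        "message": "",
    }


print(suggest_crops("loam"))
print(suggest_crops("unknown"))
```

Keeping the crop table separate from the classifier means the list of crops can be updated
without retraining the image model.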
5.2.2 Hardware Specification
The mobile platform that supports the developed mobile application, the “Soilla App”,
must have the following hardware specifications to function conveniently.
To be used efficiently, the trained soil classification model is integrated into a mobile
application called Soilla running on Android smartphones, so that both farmers and
agricultural workers can use their smartphones to classify their soil sample and find out
the right crops to plant on it. Most software defines two sets of system requirements:
minimum and recommended. The application's minimum requirements include:
1. A smartphone
2. The Android operating system (the application was developed with Android Studio,
version 3.3 and above)
However, with the increasing demand for higher processing power and resources in newer
versions of the software, it is recommended to use Android API level 26 (Android 8.0),
which supports the full functionality of the system services. The recommended
requirements are:
1. At least a 2G network
2. API level 26
5.3 Model Design
The VGG16 deep convolutional neural network, pre-trained on the ImageNet dataset, was
used as the base model, and fine-tuning was performed on this base model to generate the
desired image classification model. Fine-tuning required updating the CNN architecture
and re-training the model to learn new object classes.
Fine-tuning involved the following process:
1. Remove the fully connected nodes at the end of the network (i.e., from a pre-trained
CNN, typically VGG).
2. Replace the fully connected nodes (the head of the VGG) with freshly initialized ones.
3. Freeze the earlier CONV layers in the network (ensuring that any robust features
previously learned by the CNN are not destroyed).
4. Train the network, updating only the new FC layer head.
5. Optionally unfreeze some or all of the CONV layers in the network and perform a
second pass of training.
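The freeze and unfreeze schedule in the steps above can be pictured with a small
framework-free sketch. It does not train anything: it only tracks which layers would
receive gradient updates in each pass. The convolutional layer names follow the naming
used for VGG16 in Keras (block1_conv1 to block5_conv3), while the head layer names and all
function names are hypothetical. In Keras itself the same bookkeeping is done through each
layer's `trainable` attribute.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = True


def build_model():
    """Return (base, head): the 13 VGG16 CONV layers and a new two-layer FC head."""
    convs_per_block = [(1, 2), (2, 2), (3, 3), (4, 3), (5, 3)]
    base = [
        Layer(f"block{block}_conv{i}")
        for block, count in convs_per_block
        for i in range(1, count + 1)
    ]
    head = [Layer("head_dense"), Layer("head_output")]  # hypothetical head layers
    return base, head


def freeze_base(base):
    """Freeze every CONV layer so the pre-trained features are preserved."""
    for layer in base:
        layer.trainable = False


def unfreeze_block(base, prefix):
    """Unfreeze only the CONV layers whose name starts with the given prefix."""
    for layer in base:
        if layer.name.startswith(prefix):
            layer.trainable = True


def trainable_names(base, head):
    return [layer.name for layer in base + head if layer.trainable]


base, head = build_model()
freeze_base(base)
print("warm-up pass:", trainable_names(base, head))
unfreeze_block(base, "block5_")
print("second pass:", trainable_names(base, head))
```

In the warm-up pass only the new head is updated; in the second pass the final CONV block
is updated together with the head while the earlier blocks stay frozen.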
Figure 5.1: Left: The original VGG16 network architecture. Middle: Removing the FC
layers from VGG16 and treating the final POOL layer as a feature extractor. Right: Re-
moving the original FC Layers and replacing them with a brand new FC head. These FC
layers can then be fine-tuned to a specific dataset (the old FC Layers are no longer used).
Figure 5.2: Left: When we start the fine-tuning process, we freeze all CONV layers in the
network and only allow the gradient to back-propagate through the FC layers. Doing this
allows our network to “warm up”. Right: After the FC layers have had a chance to warm
up, we may choose to unfreeze all or some of the layers earlier in the network and allow each
of them to be fine-tuned as well.
The model was initially designed in Python, with PyCharm as the Python IDE and conda
for creating virtual environments.
The soil dataset used to train the model consists of 500 images for each class, that is
clay, sand, loam and unknown, with the unknown class consisting of random images of
objects that are not soil.
After fine-tuning just the newly initialized FC layer head and allowing the FC layers to
warm up, the model obtained 76% accuracy, which is quite respectable.
Figure 5.3: Our Keras fine-tuning network is allowed to “warm up” prior to unfreezing
only the final block of CONV layers in VGG16.
After warming up the head of the model, we unfroze the final block of CONV layers in
VGG16 while leaving the rest of the network weights frozen. Once the final CONV block
was unfrozen, we resumed fine-tuning.
Figure 5.4: We have unfrozen the final CONV block and resumed fine-tuning with Keras
and deep learning. Training and validation loss are starting to diverge, indicating the
start of over-fitting, so fine-tuning stops at epoch 20.
The researcher did not train past epoch 20 for fear of over-fitting. At the end of the
fine-tuning process, the model obtained 88% accuracy, a significant increase over
fine-tuning the FC layer head alone.
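The decision to stop once the validation loss stops improving can also be automated. The
sketch below shows one common rule, patience-based early stopping, purely as an
illustration: the report does not state that this rule was used, the function name is
hypothetical, and the loss values are made up rather than taken from Figure 5.4.

```python
def stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its best
    value for `patience` consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Made-up validation losses: they improve until epoch 4 and then rise again.
illustrative_losses = [0.90, 0.70, 0.60, 0.55, 0.57, 0.60, 0.66]
print(stopping_epoch(illustrative_losses))
```

A larger `patience` tolerates more noise in the validation curve at the cost of a few
extra epochs of training.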
The fully trained model was finally converted to a quantized TensorFlow Lite model by
following the guidelines and documentation on the main TensorFlow website. Using the
Netron application, the trained model can be visualized as shown below:
Figure 5.5: visual view of the model in Netron
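The quantization applied during the conversion step above stores values as 8-bit integers
instead of 32-bit floats. In the 8-bit scheme used by TensorFlow Lite, a real value is
approximated as scale * (q - zero_point), where q is the stored integer. The sketch below
illustrates only this arithmetic; the helper functions and the example value range are
hypothetical and are not part of the TensorFlow converter.

```python
def quantization_params(min_val, max_val, qmin=-128, qmax=127):
    """Derive a scale and zero point that map [min_val, max_val] onto int8."""
    # The range is widened to include 0.0 so that zero is represented exactly.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0:
        scale = 1.0  # degenerate range: every value is zero
    zero_point = int(round(qmin - min_val / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to the nearest representable 8-bit integer."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from its 8-bit representation."""
    return scale * (q - zero_point)


# Hypothetical weight range, chosen only to exercise the formulas.
scale, zero_point = quantization_params(-0.5, 1.5)
for value in [-0.5, 0.0, 0.3, 1.5]:
    q = quantize(value, scale, zero_point)
    print(value, "->", q, "->", round(dequantize(q, scale, zero_point), 4))
```

Each stored value takes one byte instead of four, which is why a quantized model is smaller
and better suited to running on a phone, at the cost of a rounding error of at most half
the scale per value.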
the mobile application. This was done to enable offline use for farmers in remote
places. Having the model on the phone also allowed for quick processing when classifying
soil samples and getting crop suggestions.
The layout of the Android application was designed in Figma and later implemented in
the app using XML and some custom fonts and icons.
Green was chosen as the primary color of the mobile application since agriculture is
commonly associated with green elements.
5.4.1 Soilla App User Interfaces
Figure 5.8: Mobile analytics Screen
Figure 5.9: Mobile Classification Screen
Figure 5.10: Mobile Graph Screen
Figure 5.11: Mobile Crop suggestion View
Figure 5.12: Mobile guide screen
Chapter 6: Summary, Conclusion, and
Recommendation
This chapter presents a summary of the developed system, recommendations for
interacting with it, and areas for further study, as explained below.
6.0.1 Conclusion
The Soil Classification and Crop Suggestion model and the Soilla App are able to classify
soil samples correctly and also suggest crops to be planted on the given soil.
This is expected to improve certainty in crop yields and increase farmers' incomes, thus
encouraging agricultural investment.
6.0.2 Contribution
The developed Android mobile application, the “Soilla App”, enables farmers to know the
right crops to plant on a given soil before they venture into production for business,
which in turn can reduce investment losses, thus achieving the general objective.
Farmers and agricultural officials can now both participate in classifying soil samples
within their region, helping to build a national database of soil samples and their
properties together with the suitable crops they support.
1. Inadequate datasets limited the research, and the existing datasets were incomplete.
This led to increased errors in the calculations.
2. There was a financial challenge in facilitating the purchase of research data for
training the model.
6.0.4 Requirements for the adoption of Soilla App
1. Network reception in the areas where the application will be used. The network should
be at least 2G.
2. The mobile application requires an Android smartphone. The phone can be offline or
online and the app will still function efficiently.
3. A good camera with a clear resolution for better quality images of soil samples.
6.0.5 Recommendation
The soil classification and crop suggestion application should be adopted because it is
well suited to solving Uganda's challenges of soil mapping and profiling. Farmers need the
right tools to know their soil types and the right crops to plant in them. It can be
adopted by considering the following:
1. The application should be marketed on mass media and online platforms to those who
venture into and depend on agriculture as a source of livelihood.
2. Provide the product (the Soilla App) on the Google Play store for download and usage
so that potential users who would like to classify soil and know suitable crops can
easily acquire it.
3. Areas for further study include detailed research into other parameters that determine
yield and archiving the findings in an open database.
4. The Soilla App is not specific to soil texture only but is concerned with generally
improving crop yields; thus further study is required to apply the model to the
classification of crops and to distinguishing infected plants from healthy ones for
crops such as maize, cassava and many more.
5. Extend the application to non-smartphone users through the use of USSD, since
farmers tend to prefer “feature phones”, that is, phones without an operating system
(Azam and Zawawi, 2012).
6.0.6 Time Scope of the Project
Figure 6.1: Time scope of the project development phase
Bibliography
[1] U.S. Department of Agriculture. Soil Classification.
https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/?cid=nrcs142p2_054167/,
2021. [Online; accessed 11-Nov-2021].
[2] Utpal Barman and Ridip Dev Choudhury. Soil texture classification using multi class
support vector machine. Information processing in agriculture, 7(2):318–332, 2020.
[3] Aline Dresch, Daniel Pacheco Lacerda, and José Antônio Valle Antunes. Design science
research. In Design science research, pages 67–102. Springer, 2015.
[5] FAO and UBOS. FAO and UBOS launch the Annual Agriculture Survey report,
providing key statistics for the agricultural sector.
https://www.fao.org/uganda/news/detail-events/en/c/1293240//, 2021. [Online; accessed
11-Nov-2021].
[7] Food and Agriculture Organization of the United Nations. The state of food security
and nutrition in the world 2021. 2020.
[9] Global Hunger Index. Uganda: A Closer Look at Hunger and Undernutrition.
https://www.globalhungerindex.org/case-studies/2018-uganda.html/, 2021. [Online;
accessed 11-Nov-2021].
[10] Nidhi H Kulkarni, GN Srinivasan, BM Sagar, and NK Cauvery. Improving crop
productivity through a crop recommendation system using ensembling technique. In 2018
3rd International Conference on Computational Systems and Information Technology
for Sustainable Solutions (CSITSS), pages 114–119. IEEE, 2018.
[11] Qing Li, Fenlan Gu, Yong Zhou, Tao Xu, Li Wang, Qian Zuo, Liang Xiao, Jingyi Liu,
and Yang Tian. Changes in the impacts of topographic factors, soil texture, and crop-
ping systems on topsoil chemical properties in the mountainous areas of the subtropical
monsoon region from 2007 to 2017: A case study in hefeng, china. International Journal
of Environmental Research and Public Health, 18(2):832, 2021.
[12] Tandzi Ngoune Liliane and Mutengwa Shelton Charles. Factors Affecting Yield of
Crops. https://www.intechopen.com/chapters/70658, 2021. [Online; accessed
11-Nov-2021].
[13] Mark D Nelson, Kurt H Riitters, John W Coulston, Grant M Domke, Eric J Greenfield,
Linda L Langner, David J Nowak, Claire B O’Dea, Sonja N Oswalt, Matthew C Reeves,
et al. Defining the united states land base: a technical document supporting the usda
forest service 2020 rpa assessment. Gen. Tech. Rep. NRS-191., 191:1–70, 2020.
[14] Thereza Patricia Pereira Padilha and Lucas Estanislau Alves de Lucena. A systematic
review about use of tensorflow for image classification and word embedding in the brazil-
ian context. Academic Journal on Computing, Engineering and Applied Mathematics,
1(2):24–27, 2020.
[15] Sk Al Zaminur Rahman, Kaushik Chandra Mitra, and SM Mohidul Islam. Soil classi-
fication using machine learning methods and crop suggestion based on soil series. In
2018 21st International Conference of Computer and Information Technology (ICCIT),
pages 1–4. IEEE, 2018.
[16] Kateřina Sedláčková and Lenka Ševelová. Comparison of laser diffraction method and
hydrometer method for soil particle size distribution analysis. Acta Horticulturae et
Regiotecturae, 24(1):49–55, 2021.
[17] Shivi Sharma, Geetanjali Rathee, and Hemraj Saini. Big data analytics for crop pre-
diction mode using optimization technique. In 2018 Fifth International Conference on
Parallel, Distributed and Grid Computing (PDGC), pages 760–764. IEEE, 2018.
[19] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir
Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper
with convolutions. In Proceedings of the IEEE conference on computer vision and
pattern recognition, pages 1–9, 2015.
[20] VGG16. VGG16 – Convolutional Network for Classification and Detection.
https://neurohive.io/en/popular-networks/vgg16/, 2021. [Online; accessed 11-Nov-2021].
[21] Jan vom Brocke, Alan Hevner, and Alexander Maedche. Introduction to design science
research. In Design Science Research. Cases, pages 1–13. Springer, 2020.
[22] Heather Winskel, Tae-Hoon Kim, Lauren Kardash, and Ivanka Belic. Smartphone use
and study behavior: A korean and australian comparison. Heliyon, 5(7):e02158, 2019.