Machine Learning Pneumonia Prediction
BY
20/47cs/01355
AUGUST 2024
Automatic Detection and Analysis of Pneumonia X-rays and
COVID-19
BY
August, 2024
DECLARATION
I hereby declare that this research work titled “Automatic detection and analysis of
pneumonia X-rays and COVID-19” is my own work and has not been submitted by
any other person for any degree or qualification at any higher institution. I also declare
that the information provided herein is mine and that any information that is not mine is
properly acknowledged.
__________________________
________________________
CERTIFICATION
This is to certify that the research project titled “Automatic detection and analysis of
pneumonia X-rays and COVID-19” was carried out by Saliu Fuhid Olakunle. The
project has been read and approved as meeting the requirements for the award of
Bachelor of Science (B.Sc.) Degree in Computer Science in the Department of Computer
Science, Faculty of Information and Communication Technology, Kwara State
University, Malete.
______________________ ___________________
Supervisor
_______________________ ____________________
Head of Department
_______________________ _____________________
DEDICATION
This project is dedicated to God Almighty, the beginning and the end, who has been with
me from my birth until this moment, and also to my dad, my mum, my friends, and my
supervisor for their support, guidance, and prayers.
ACKNOWLEDGEMENT
All praise and adoration belong to God for his mercy and protection over me throughout
my program in the university.
I acknowledge the efforts of my dad (Mr. Saliu Lukman) and my mum (Mrs. Saliu
Sherifat); may God spare their lives to reap the rewards of their labor. My sincere
appreciation also goes to my loving and caring siblings, Saliu Zikrulahi for
his leadership and Saliu Olamilekan for his encouraging words towards the success of
this program, and to the entire family and community in general. May God reward
them all abundantly. Furthermore, I acknowledge the support of my friends (Alabi
Daniel, Alabi Opeyemi, Balogun Al-ameen and Badmus Ikramah). May Almighty God
be with them and crown their efforts with success.
I appreciate my colleagues in the university and all my classmates. May God answer our
prayers and crown all our efforts with success. I also thank the school authorities for
creating an opportunity and avenue for us to be exposed to the outside world.
My profound gratitude goes to my supervisor, Dr. R.M. Isiaka, who did all he could to
make this report a successful one. My appreciation also goes to all lecturers in the
department.
Contents
ABSTRACT
CHAPTER ONE
INTRODUCTION
CHAPTER TWO
LITERATURE REVIEW
CHAPTER THREE
3.3.2 Kivy/KivyMD
3.3.3 NumPy
3.3.4 Pandas
CHAPTER FOUR
Best Requirements
4.1 Results
4.1.1 Interfaces
4.2 Discussion
CHAPTER FIVE
Summary
Recommendations
Conclusion
REFERENCE
ABSTRACT
This project introduces a sophisticated machine learning-based application designed for
the analysis and diagnosis of chest X-ray images, with a specific focus on detecting
tuberculosis. Leveraging a Convolutional Neural Network (CNN), the application excels
in image classification by utilizing advanced pre-processing and feature extraction
techniques. The CNN architecture comprises convolutional, pooling, and fully connected
layers to effectively capture hierarchical image features, ensuring precise and reliable
diagnostic outcomes. The application features a user-friendly interface developed using
the KivyMD framework, which incorporates Material Design principles to enhance
visual appeal and usability. This interface allows medical professionals to easily upload
and preview images, as well as receive clear, actionable results from the analysis. The
system's strengths include its high accuracy in prediction, intuitive design, and flexibility
across various hardware platforms. However, the application is limited by its
performance on lower-quality images and its dependency on more powerful hardware for
optimal functionality. To address these limitations, future improvements should focus on
expanding hardware compatibility, enhancing diagnostic capabilities beyond
tuberculosis, and refining image pre-processing methods to accommodate a broader
range of image qualities. Overall, the application represents a significant advancement in
medical image analysis, offering a powerful tool for healthcare professionals to assist in
timely and accurate diagnoses.
CHAPTER ONE
INTRODUCTION
1.1 Background to the Study
It is common knowledge that agriculture is a vital activity for human livelihood, providing food,
feed, fibres, fuel, and raw materials. It is expected that the global population will reach 8 billion
people by 2025 and almost 10 billion by 2050 (Huang et al., 2020). This will lead to a significant
increase in demand for basic human needs, particularly food, in terms of both quantity and quality.
To accommodate these needs, global food production must rise by about 60–70% (Javaid et al.,
2023).
Agriculture 4.0, also known as the “Digital Agricultural Revolution,” represents a paradigm shift in
agriculture, leveraging cutting-edge technologies to optimize various aspects of farming
operations (Liu, 2020). These technologies encompass the Internet of Things (IoT), Artificial
Intelligence (AI), Big Data, cloud computing, Decision Support Systems (DSS), advanced
sensing technology, and autonomous robots. Sensors and robotics play a crucial role in collecting
essential field data, which is then transmitted to a local or cloud server via IoT technology for
storage, processing, and analysis. Big Data and AI-based techniques can be used to convert these
data into valuable insights. To facilitate user interaction and informed decision-making, a DSS
equips users with the necessary tools to optimize the agricultural system and undertake
appropriate actions (Sharma et al., 2021).
Agriculture 4.0 generates and processes a huge volume of data that will serve as a foundation for
decision-making. It is believed that Agriculture 4.0 can bring major global improvements in terms of
increasing the productivity and efficiency of agricultural and food systems, improving the
quantity, quality, and accessibility of agricultural products, adapting to climate change, reducing
food loss and waste, optimizing the use of natural resources in a sustainable way, and,
consequently, reducing the environmental impact in the years to come (Bhat & Huang, 2021).
The agricultural sector, which utilizes approximately 70% of the world’s freshwater, faces
significant challenges due to increasing water scarcity and the need for sustainable farming
practices (Faouzi et al., 2020). To address these challenges, the integration of machine learning
(ML) for real-time monitoring and optimization of water usage has emerged as a crucial
technological advancement. This study explores the development and implementation of an
ML model aimed at optimizing irrigation processes to enhance water efficiency and crop
productivity.
Machine learning techniques have proven valuable in predicting soil properties, allowing
farmers, researchers, and stakeholders to make informed decisions regarding soil fertility,
moisture levels, and nutrient concentrations (Meshram et al., 2021). By assimilating data from
various sources, machine learning models provide valuable insights into the dynamic nature of
soil behavior, allowing for proactive adjustments in farming practices to ensure optimal
conditions for crop growth and yield. Additionally, through the application of computer vision
and remote sensing data, ML simplifies the monitoring of both crops and soil conditions by
offering timely information on crop health, growth stages, and potential stressors (Wang et al.,
2022).
The agricultural sector faces significant challenges in optimizing water usage due to factors such
as inefficient traditional irrigation methods, climate variability, and the increasing demand for
sustainable farming practices (Veeragandham & Santhi, 2020). Traditional irrigation systems
often result in substantial water wastage and fail to adapt to the dynamic water needs of crops
influenced by environmental factors such as weather and soil conditions. The lack of real-time
monitoring and precise control mechanisms exacerbates these inefficiencies, leading to lower
water-use efficiency and reduced agricultural productivity. Therefore, there is a critical need to
develop a machine learning-based model that enables real-time monitoring and optimization of
water usage in agricultural practices (Jin et al., 2020).
and crop yield. This is crucial given the growing global demand for food and the need for
sustainable agricultural practices amidst climate variability. Additionally, the study’s outcomes
can provide farmers with actionable insights and automated irrigation controls, reducing water
waste and labor efforts. The implementation of a user-friendly interface will facilitate the
adoption of these advanced technologies in everyday farming, promoting widespread benefits for
the agricultural sector and contributing to environmental conservation.
Machine Learning (ML)
A subset of artificial intelligence (AI) that involves training algorithms to recognize patterns in
data and make predictions or decisions without being explicitly programmed. In agriculture, ML
can analyze data from various sources to optimize processes such as irrigation and crop
management.
Internet of Things (IoT)
A network of physical devices embedded with sensors, software, and other technologies to
connect and exchange data with other devices and systems over the internet. In agricultural
applications, IoT devices monitor environmental conditions, soil moisture, and plant health in
real time.
Precision Agriculture
A farming management concept that uses technology to observe, measure, and respond to
variability in crops. This approach aims to optimize field-level management regarding crop
farming to improve yields and resource use efficiency, including water.
Real-Time Monitoring
The continuous collection and analysis of data as it is generated. In the context of this study,
real-time monitoring involves using IoT sensors to gather and transmit data on soil moisture,
weather conditions, and crop status to support immediate decision-making.
Irrigation Optimization
The process of adjusting irrigation practices to ensure that water is used efficiently and
effectively, minimizing waste while maximizing crop yield. This often involves using
technology to determine the optimal timing and amount of water application.
User-Friendly Interface
A design characteristic of software applications that ensures they are easy to use and understand
by end-users. For this study, it refers to the mobile or web platforms that allow farmers to
interact with the ML model and control irrigation systems effortlessly.
CHAPTER TWO
LITERATURE REVIEW
2.1 Related concepts
Smart Irrigation:
Smart irrigation refers to the use of advanced technologies to manage and optimize the use of
water in agriculture. This system relies on sensors and automated controllers that adjust watering
schedules based on real-time data on soil moisture, weather forecasts, and crop requirements. By
precisely managing water delivery, smart irrigation systems reduce water waste, lower costs, and
improve crop health and yield (Sharma et al., 2021). These systems can include drip irrigation,
sprinkler systems, and soil moisture-based systems, which are all controlled by data-driven
decisions. Furthermore, smart irrigation contributes to sustainable water management by
ensuring that water resources are used efficiently and effectively.
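The control logic described above can be illustrated with a minimal sketch. It assumes a single soil moisture reading and a 24-hour rainfall forecast; the names `FieldReading` and `irrigation_minutes` and all threshold values are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class FieldReading:
    """One snapshot of field conditions (field names are illustrative)."""
    soil_moisture_pct: float  # volumetric water content reported by a soil sensor, in percent
    rain_forecast_mm: float   # rainfall expected over the next 24 hours, in millimetres


def irrigation_minutes(reading, target_pct=30.0, rain_skip_mm=5.0, minutes_per_pct=2.0):
    """Return how many minutes to irrigate for one reading.

    All three thresholds are assumed values chosen for illustration only.
    """
    # Skip irrigation when enough rain is forecast to cover the deficit.
    if reading.rain_forecast_mm >= rain_skip_mm:
        return 0.0
    # Irrigate in proportion to the gap between the target and measured moisture.
    deficit = target_pct - reading.soil_moisture_pct
    if deficit <= 0:
        return 0.0
    return deficit * minutes_per_pct


if __name__ == "__main__":
    dry_field = FieldReading(soil_moisture_pct=22.0, rain_forecast_mm=0.0)
    print(irrigation_minutes(dry_field))  # 16.0
```

A real controller would also account for crop type and growth stage; this sketch only shows how sensor data and a forecast can be combined into a watering decision.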
Internet of Things (IoT):
The Internet of Things (IoT) in agriculture involves the use of interconnected devices that collect
and share data to enhance farming operations. IoT devices can include soil sensors, weather
stations, drones, and automated machinery. These devices collect vast amounts of data that can
be analyzed to monitor crop health, predict weather conditions, and optimize irrigation and
fertilization schedules. The connectivity provided by IoT enables real-time monitoring and
management, leading to more efficient and productive agricultural practices (Javaid et al., 2023).
IoT systems can also facilitate remote management, allowing farmers to control equipment and
systems from a distance, improving operational efficiency and response times.
Machine Learning:
Machine learning is a branch of artificial intelligence that focuses on building systems that can
learn from and make decisions based on data. In agriculture, machine learning algorithms
analyze data from various sources such as soil sensors, weather stations, and crop images to
predict outcomes and optimize farming practices. For example, machine learning can help in
predicting the best times to water crops, identifying disease outbreaks early, and improving crop
yields by recommending the best agricultural practices based on historical data (Meshram et al.,
2021). These predictive capabilities can lead to more proactive and efficient farm management,
reducing risks and enhancing productivity.
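As a simple illustration of predicting watering times from historical data, the sketch below fits an ordinary least-squares line relating daily temperature to daily soil moisture loss, then estimates how many days remain before irrigation is needed. Linear regression is used here only as the simplest possible learning model; the data values, function names, and the moisture threshold are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_irrigation(current_pct, threshold_pct, daily_loss_pct):
    """Estimate days until soil moisture falls to the irrigation threshold."""
    if daily_loss_pct <= 0:
        raise ValueError("daily_loss_pct must be positive")
    if current_pct <= threshold_pct:
        return 0.0
    return (current_pct - threshold_pct) / daily_loss_pct


if __name__ == "__main__":
    # Synthetic history: mean daily temperature (deg C) vs. moisture loss (percentage points per day).
    temperatures = [20.0, 25.0, 30.0, 35.0]
    losses = [1.0, 1.5, 2.0, 2.5]
    slope, intercept = fit_line(temperatures, losses)

    # Predict the loss rate for a forecast temperature, then the time left before watering.
    predicted_loss = slope * 28.0 + intercept
    print(days_until_irrigation(30.0, 21.0, predicted_loss))
```

In practice, the models surveyed in the literature use many more input variables and more expressive algorithms; the sketch only shows the fit-then-predict pattern.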
Soil Moisture Sensors:
Soil moisture sensors are devices used to measure the water content in the soil. These sensors
provide critical data for irrigation management, helping farmers to apply the right amount of
water at the right time. Soil moisture sensors can be placed at various depths to get a
comprehensive understanding of soil moisture levels throughout the root zone. This information
helps in preventing over-irrigation or under-irrigation, both of which can harm crop health and
reduce yields. The data collected can be transmitted to a central system where it is analyzed and
used to make informed irrigation decisions, ensuring optimal water use efficiency (Wang et al.,
2022).
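One possible way to combine readings taken at several depths into a single root-zone value is a weighted average, sketched below. The depth weights and the low and high moisture limits are assumed values for illustration, and the function names are hypothetical.

```python
def root_zone_moisture(readings, weights):
    """Weighted average of moisture readings keyed by sensor depth in centimetres.

    `weights` expresses how much each depth contributes to the root zone
    (assumed values; they do not need to sum to one).
    """
    total_weight = sum(weights[depth] for depth in readings)
    weighted_sum = sum(readings[depth] * weights[depth] for depth in readings)
    return weighted_sum / total_weight


def irrigation_status(moisture_pct, low_pct=20.0, high_pct=35.0):
    """Classify a root-zone moisture value against assumed lower and upper limits."""
    if moisture_pct < low_pct:
        return "under-irrigated"
    if moisture_pct > high_pct:
        return "over-irrigated"
    return "adequate"


if __name__ == "__main__":
    readings = {10: 18.0, 30: 24.0, 60: 30.0}  # depth (cm) -> moisture (%)
    weights = {10: 0.5, 30: 0.3, 60: 0.2}      # depth (cm) -> assumed root share
    average = root_zone_moisture(readings, weights)
    print(irrigation_status(average))
```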
Remote Sensing:
Remote sensing involves collecting information about an object or area from a distance, typically
using satellite or aerial imagery. In agriculture, remote sensing is used to monitor crop health,
soil conditions, and environmental changes. Technologies such as drones equipped with cameras
and sensors, satellites with multispectral imaging, and thermal sensors can provide detailed
images and data. This information helps farmers detect issues such as pest infestations, nutrient
deficiencies, and water stress early, allowing for timely interventions that can improve crop
health and yields. Remote sensing also supports large-scale monitoring and management, making
it possible to oversee extensive farming operations efficiently (Liu, 2020).
Data Analytics:
Data analytics in agriculture involves examining data sets to draw conclusions and make data-
driven decisions to optimize farming practices. This process includes collecting data from
various sources such as sensors, weather stations, and historical crop performance records.
Advanced analytical tools and techniques, including statistical analysis, predictive modeling, and
machine learning, are used to identify patterns and trends. Insights gained from data analytics
help farmers make better decisions regarding planting schedules, irrigation, fertilization, and pest
control, ultimately leading to more efficient and productive farming. Data analytics also supports
the development of more accurate and customized farming strategies (Benos et al., 2021).
Sustainable Agriculture:
Sustainable agriculture focuses on farming practices that meet current agricultural needs without
compromising the ability of future generations to meet theirs. This approach emphasizes
resource conservation, environmental protection, and economic viability. Sustainable agriculture
practices include crop rotation, conservation tillage, integrated pest management, and organic
farming (Sharma et al., 2021). The goal is to create a balance between the need for food
production and the preservation of the ecological systems that support agriculture. Sustainable
practices help in maintaining soil health, reducing water usage, minimizing chemical inputs, and
enhancing biodiversity. By adopting sustainable practices, farmers can contribute to long-term
food security and environmental sustainability.
Crop Health Monitoring:
Crop health monitoring involves the continuous assessment of plant conditions to ensure optimal
growth and yield. This practice uses various technologies such as drones, satellites, and
ground-based sensors to gather data on plant health indicators like color, biomass, chlorophyll content,
and temperature. Techniques such as multispectral and hyperspectral imaging can detect stress
factors like diseases, nutrient deficiencies, and water stress before they become visible to the
naked eye. By identifying these issues early, farmers can take corrective actions promptly,
applying precise treatments that minimize crop loss and maximize productivity (Bhat & Huang,
2021). Additionally, machine learning models can analyze historical and real-time data to predict
potential problems and suggest preventive measures.
Climate-Smart Agriculture:
drones. However, the study acknowledges limitations such as the need for further research on the
scalability and cost-effectiveness of these AI technologies in diverse agricultural settings. In
conclusion, the paper emphasizes that AI technologies offer solutions to challenges in
agriculture, enhancing productivity, reducing resource wastage, and improving overall efficiency
in farming practices.
Qazi et al. (2022), in “IoT-Equipped and AI-Enabled Next Generation Smart Agriculture: A
Critical Review, Current Challenges and Future Trends,” provide a comprehensive review of
smart agriculture systems built on IoT technologies and AI techniques, discussing the importance
of smart agriculture practices, current hardware building blocks, automated control algorithms
for smart irrigation, and the application of AI and deep learning (DL) in smart agriculture.
The methodology involves detailing advancements in smart agriculture
systems, reviewing available technologies and challenges, and discussing future trends. The
results highlight the potential of IoT and AI in revolutionizing conventional agriculture practices.
However, the limitations include challenges in widespread deployment and the need for further
research to address these obstacles. In conclusion, the paper emphasizes the critical role of IoT
and AI in shaping the future of agriculture, stressing the need for global adoption of smart
agriculture systems to overcome challenges in food demand, arable land shortage, pesticide
regulations, and water scarcity.
Liu (2020), in “Artificial intelligence (AI) in agriculture,” focuses on the application of
Artificial Intelligence (AI) in agriculture, particularly within the Agricultural Research
Service (ARS). The primary objectives were to leverage AI-based tools for site-specific
decision-making in agriculture, enhance early-warning systems for pest and disease
outbreaks, and promote sustainable cropland management practices. The methodology involved
the development of AI tools that utilize site-based science, big data, remote sensing, neural
networks, and machine learning to advance agricultural research. The results highlighted the
transformative potential of AI in revolutionizing agriculture by optimizing crop production,
resource management, and environmental sustainability. However, the paper acknowledged
limitations in the scope of projects covered and the need for continuous improvement in
technical capacity. In conclusion, the research underscores the critical role of AI in driving an
agricultural revolution to meet the increasing global food demand while optimizing resource
utilization and sustainability.
Bhat and Huang (2021), in “Big Data and AI Revolution in Precision Agriculture: Survey and
Challenges,” explore the applications of big data and AI in precision
agriculture, focusing on data creation methods, technology accessibility, data analytics, and
challenges faced in implementation. The methodology involved a systematic literature review to
identify relevant studies from 2000-2020, resulting in 77 selected papers. The results highlighted
the significance of innovative machine learning techniques like CNN in processing vast,
heterogeneous agricultural data for improved decision-making. However, limitations include the
complexity of managing unstructured data and the need for advanced real-time data handling
platforms. In conclusion, the study emphasizes the transformative potential of big data and AI in
precision farming, offering opportunities for enhanced decision-making and addressing evolving
agricultural production challenges through scalable learning methods.
Faouzi et al. (2020), in “Wastewater reuse in agriculture sector: Resources management and
adaptation in the context of climate change: case study of the Beni Mellal-Khenifra region,
Morocco,” evaluated the efficiency of wastewater treatment plants (WWTPs) in the
Beni Mellal-Khenifra region, focusing on physicochemical and biological parameters, as well as
vegetation cover evolution using satellite images. The
methodology involved assessing six WWTPs based on water quality, conducting surveys with
farmers and residents, and analyzing satellite images to determine the impact on vegetation
cover. Results indicated that treated wastewater met Moroccan standards for reuse in irrigation,
with Boujaad WWTP standing out as a model. Limitations included variations in treated
wastewater quality among different WWTPs, such as high COD levels in some plants. Despite
limitations, the study concluded that wastewater reuse in agriculture can help secure irrigation in
the region, ensuring water availability and quality amidst climate change challenges.
Zhai et al. (2020), in “Decision support systems for agriculture 4.0: Survey and challenges,”
explore the challenges of employing agricultural decision support systems in Agriculture 4.0
through a systematic literature review of thirteen representative decision support systems.
The methodology involves analyzing each system in terms of interoperability,
scalability, accessibility, and usability. The results highlight seven upcoming challenges, such as
the need to simplify graphical user interfaces, enrich functionalities, and adapt to uncertainty.
The limitations include the potential for inaccurate decision support due to the complexity of
agricultural problems. In conclusion, the study emphasizes the importance of overcoming these
challenges to enhance the development and effectiveness of agricultural decision support
systems in Agriculture 4.0, ultimately improving decision-making processes for farmers and
contributing to higher productivity and sustainability in agriculture.
empowerment, and food security. Limitations included the lack of consistent reporting on FO
impacts and the potential influence of external factors on outcomes. In conclusion, the study
highlighted the diverse roles of FOs in enhancing smallholder agriculture but emphasized the
need for more rigorous research to understand their full impact and address existing limitations.
Huang et al. (2020), in “Water-saving agriculture can deliver deep water cuts for China,”
assessed the impact of on-farm water management interventions on water
consumption reductions and maize production in China. The study utilized the AquaCrop model
to simulate various scenarios of integrated water management interventions, comparing results
with previous studies for validation. The findings indicated that interventions like improved
irrigation and soil management practices could lead to a substantial reduction in water
consumption nationally, particularly in water-stressed regions like the North China Plain and
Northeast China. These interventions also showed potential to increase maize production,
contributing significantly to meeting future demand. However, the study acknowledged
limitations such as assumptions of full irrigation setup and potential overestimation of water
consumption cuts in certain regions. In conclusion, the research highlighted the importance of
on-farm water management interventions in achieving Sustainable Development Goals related to
water, land, and food security in China and beyond.
“Coping with salinity in irrigated agriculture: Crop evapotranspiration and water management
issues” The research paper aims to address the challenges of soil and water salinity in irrigated
agriculture by focusing on strategies to understand the impacts of salinity on soil water balances
and evapotranspiration (ET) for optimal water management. The study utilizes the FAO56
framework to compute water requirements in saline environments, incorporating stress
coefficients to adjust crop coefficients. By applying both steady state and transient models, the
research provides insights into salinity effects on crop growth and irrigation scheduling. The
methodology involves modeling salinity build-up in the root zone and discussing soil-crop-water
management interventions for maintaining crop growth under saline conditions. The results
highlight the importance of adequate irrigation methods, cyclic use of multi-salinity waters, and
proper irrigation scheduling to mitigate salinity effects. However, the study acknowledges
limitations in the disposal of saline drainage water and the complexity of salinity impacts
influenced by various environmental and management factors. In conclusion, the research
underscores the significance of tailored irrigation strategies and water management practices to
cope with salinity challenges in irrigated agriculture.
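The stress-coefficient adjustment mentioned above can be sketched as follows. The sketch assumes the salinity relationship given in FAO Irrigation and Drainage Paper 56, in which the stress coefficient is Ks = 1 - (b / (100 Ky)) (ECe - ECe_threshold) and the adjusted crop evapotranspiration is Ks x Kc x ETo. The function names, the clamping of Ks at zero, and the sample input values are illustrative assumptions and are not taken from the reviewed paper.

```python
def salinity_stress_coefficient(ece, ece_threshold, b, ky):
    """Salinity stress coefficient Ks (dimensionless, between 0 and 1).

    ece           -- electrical conductivity of the soil saturation extract (dS/m)
    ece_threshold -- ECe at which crop yield starts to decline (dS/m)
    b             -- percentage yield reduction per unit increase in ECe (%/(dS/m))
    ky            -- yield response factor (dimensionless)
    """
    if ece <= ece_threshold:
        return 1.0  # no salinity stress below the threshold
    ks = 1.0 - (b / (100.0 * ky)) * (ece - ece_threshold)
    # Clamping at zero is an added safeguard for this sketch, not part of the formula.
    return max(ks, 0.0)


def adjusted_crop_et(eto, kc, ks):
    """Adjusted crop evapotranspiration (mm/day) as Ks * Kc * ETo."""
    return ks * kc * eto


if __name__ == "__main__":
    # Illustrative inputs only; real values depend on the crop and the site.
    ks = salinity_stress_coefficient(ece=3.7, ece_threshold=1.7, b=12.0, ky=1.25)
    print(adjusted_crop_et(eto=5.0, kc=1.2, ks=ks))
```

The linear relationship is generally treated as valid only over a limited salinity range, so results far above the threshold should be interpreted with caution.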
Veeragandham and Santhi (2020), in “A review on the role of machine learning in agriculture,”
review the role of machine learning in agriculture by analyzing various
machine learning approaches used in the past five years, highlighting their advantages and
disadvantages. The methodology involves a comprehensive literature survey to gather
information on the application of machine learning in agriculture, focusing on areas such as
topsoil management, disease detection, yield prediction, and species management. The results
indicate that machine learning models have significantly contributed to increasing productivity
and improving soil classification, disease detection, water management, yield prediction, crop
quality, and weed detection in agriculture. However, the limitations of the study include the
challenges associated with data collection, cost implications, and the need for further research to
address specific agricultural issues. In conclusion, the paper emphasizes the importance of
machine learning in revolutionizing agriculture by enabling faster and more optimal decision-
making processes, ultimately leading to enhanced agricultural practices and productivity.
Steinfeld et al. (2020), in “The human dimension of water availability: Influence of management
rules on water supply for irrigated agriculture and the environment,” investigated the
influence of management rules on water allocations for irrigated agriculture and
the environment in the Gwydir and Macquarie Rivers of the Murray-Darling Basin, Australia.
The methodology involved using hydrological simulation models and regression-based
sensitivity analyses to compare the impacts of water management decisions, climate, and river
system characteristics on regulated and unregulated water allocations. The results indicated that
management decisions significantly influenced regulated water allocations more than
unregulated ones, with changing management rules potentially varying long-term water
allocations. However, the study had limitations such as not examining changes in the magnitude
or distribution of future climate drivers. In conclusion, the research emphasized the importance
of transparent and systematic approaches to justify water management rules for maximizing
benefits to water users and river health in a variable and changing climate.
Saad et al. (2020), in “Water Management in Agriculture: A Survey on Current Challenges and
Technological Solutions,” survey recent works on water management
in agriculture, focusing on challenges and technological solutions. The methodology involves
reviewing existing literature on water usage in agriculture, including topics like water pollution,
irrigation, reuse, leaks in pipelines, and livestock drinking water. The results highlight the
importance of advanced technologies such as the Internet of Things (IoT), Wireless Sensor
Network (WSN), and cloud computing in enhancing water exploitation and management
efficiency. However, the paper acknowledges limitations in the literature, particularly the
marginal investigation of challenges related to livestock drinking water. In conclusion, the study
emphasizes the need for future research to propose innovative smart concepts and tools for
efficient water management in agriculture, building on the advancements made with modern
technologies.
Meshram et al. (2021), in “Machine learning in agriculture domain: A state-of-art survey,”
conduct an extensive survey on the application of machine learning in
agriculture, specifically focusing on pre-harvesting, harvesting, and post-harvesting stages to
alleviate farming problems. The methodology involves reviewing various machine learning
algorithms used in agriculture, such as K-Means clustering, ANN, and SVM, to enhance disease
detection and classification rates in crops. The results indicate that machine learning
technologies have significantly improved outcomes in agriculture, aiding farmers in reducing
losses and enhancing productivity. However, the study acknowledges limitations such as the
need for standard experimental methods, dataset creation, and sharing for validation by other
researchers. In conclusion, the paper emphasizes the importance of following the machine
learning pipeline, creating datasets, and sharing knowledge to benefit the agriculture sector and
support future research endeavors.
(Wang et al., 2022), “A Review of Deep Learning in Multiscale Agricultural Sensing” The
research paper aims to review the application of deep learning in multiscale agricultural sensing,
focusing on convolutional neural network-based supervised learning (CNN-SL), transfer learning
(TL), and few-shot learning (FSL) in crop sensing at various scales. The methodology involved a
comprehensive investigation of typical studies utilizing CNN-SL, TL, and FSL in agricultural
sensing, particularly at leaf, canopy, field, and land scales. Results highlighted the effectiveness
of deep learning models in tasks such as crop classification, disease detection, and pest
recognition, showcasing advancements in accuracy and model robustness. However, limitations
were identified, including data specificity, small dataset sizes, and computational capacity
constraints hindering real-time applications. In conclusion, the study emphasizes the potential of
deep learning to revolutionize precision agriculture by enabling informed decision-making based
on high-resolution farmland imagery, while also acknowledging the need for further research to
address current challenges and propel the evolution of modern agriculture.
(Benos et al., 2021) “Machine Learning in Agriculture: A Comprehensive Updated Review” The
research aimed to explore the application of machine learning (ML) in agriculture by reviewing
recent literature focusing on crop, water, soil, and livestock management. The study utilized
PRISMA guidelines to select journal papers published between 2018-2020, revealing a
multidisciplinary approach with a spotlight on crop management. Various ML algorithms were
employed, with Artificial Neural Networks proving more effective, particularly in analyzing
maize, wheat, cattle, and sheep data. The use of sensors on satellites and unmanned vehicles
provided reliable input data for analysis. However, the study's limitations may include a potential
lack of generalizability due to the specific timeframe and focus on journal articles. In conclusion,
the research underscores the significant potential of ML in agriculture, emphasizing the need for
systematic exploration and awareness among stakeholders to enhance farming practices and
contribute to future research in this domain.
(Zhang et al., 2020) “Applications of Deep Learning for Dense Scenes Analysis in Agriculture:
A Review” The research aims to explore the applications of Deep Learning (DL) in the analysis
of dense scenes in agriculture, addressing challenges such as severe occlusions and small object
sizes that complicate such analyses. Methodologically, the review first describes the types of
dense agricultural scenes and their specific challenges, then introduces various popular deep
neural networks employed in these scenarios. It comprehensively covers how these neural
networks are applied to agricultural tasks like recognition, classification, detection, counting, and
yield estimation. The results highlight the effectiveness of DL in handling dense agricultural
scenes, though it also identifies limitations and suggests directions for future research.
In conclusion, while DL shows significant promise for improving agricultural scene analysis,
further advances are needed to overcome the identified limitations and improve performance.
(Jin et al., 2020) “Deep Learning Predictor for Sustainable Precision Agriculture Based on
Internet of Things System” This study aimed to enhance weather prediction performance in
precision agriculture IoT systems by using a deep learning predictor with a sequential two-level
decomposition structure. The complex nonlinear relationships and multiple components in
weather data make accurate predictions challenging. To address this, the weather data were
decomposed into four components serially, and gated recurrent unit (GRU) networks were
trained as sub-predictors for each component. These individual predictions were then combined
to produce medium- and long-term forecasts. Experiments conducted using weather data from an
IoT system in Ningxia, China, for wolfberry planting demonstrated that the proposed model
could accurately predict temperature and humidity, thereby supporting the planning and control
needs of sustainable agricultural production.
CHAPTER THREE
Figure 3.0 crop water requirement dataset
Data loading
Data loading is the initial step in the data processing pipeline where raw data is imported into the
system from various sources. This data can come in different formats such as CSV files, Excel
sheets, databases, or even real-time streams from IoT devices like water flow meters. The main
objective during data loading is to ensure that the data is accurately and efficiently transferred
into the working environment, typically a data frame in Python using libraries like Pandas. This
step is crucial as it lays the foundation for all subsequent processing tasks. If the data loading
process is not handled correctly, it can lead to issues such as missing data points, misaligned
rows, or even complete data loss.
During this phase, it’s important to account for the quality and format of the data being loaded.
For instance, if data comes from multiple sources, ensuring that the schema (e.g., column names,
data types) is consistent across these sources is vital. Often, this step might involve converting
the raw data into a more standardized format or performing initial checks for data integrity, such
as verifying that all expected fields are present. In cases where data is being loaded from large
files or databases, optimizing the loading process through techniques like chunking (loading data
in smaller segments) can help manage memory usage and improve performance.
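As an illustration only, the loading step described above could be sketched as follows with Pandas. The column names and the small in-memory CSV are hypothetical stand-ins, not the project's actual dataset; a real file path would replace the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical column names; the real dataset's schema may differ.
EXPECTED_COLUMNS = ["crop_type", "soil_type", "region", "temperature", "water_requirement"]


def load_in_chunks(source, chunksize=2):
    """Read a CSV in chunks, check each chunk's schema, and concatenate the result."""
    chunks = []
    for chunk in pd.read_csv(source, chunksize=chunksize):
        missing = set(EXPECTED_COLUMNS) - set(chunk.columns)
        if missing:
            raise ValueError(f"Missing expected columns: {sorted(missing)}")
        chunks.append(chunk)
    return pd.concat(chunks, ignore_index=True)


# Small in-memory stand-in for a CSV file on disk (illustrative values).
csv_text = """crop_type,soil_type,region,temperature,water_requirement
RICE,clay,humid,30,9.5
MAIZE,loamy,semi-arid,25,6.1
WHEAT,sandy,arid,20,4.8
"""

df = load_in_chunks(io.StringIO(csv_text))
print(df.shape)
```

Reading in chunks keeps only a limited number of rows in memory at a time, and the schema check fails early if an expected field is absent.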
Data wrangling
Once the data is successfully loaded, data wrangling is the next critical step in the processing
pipeline. Data wrangling, also known as data cleaning or preprocessing, involves transforming
raw data into a more useful and structured format. This stage addresses issues such as missing
values, outliers, and inconsistencies in the data. For example, if certain data entries are missing,
strategies like imputation (filling in missing values based on statistical methods) or deletion of
incomplete records might be employed. Similarly, outliers that could distort analysis are either
corrected or removed, depending on the context.
Beyond cleaning, data wrangling also involves transforming the data to make it suitable for
analysis. This might include normalizing or scaling numerical values, encoding categorical
variables, and creating new features that could enhance the predictive power of the model. For
instance, in the context of water usage optimization, a feature like "time of day" might be derived
from a timestamp to analyze usage patterns more effectively. The goal of data wrangling is to
prepare a high-quality dataset that is free from errors and ready for the next steps of analysis and
modeling. Properly wrangled data not only improves the accuracy of models but also ensures
that the insights derived from the data are reliable and actionable.
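The wrangling operations mentioned above (imputation, outlier handling, and deriving a time-of-day feature) might look like the following minimal sketch. The records and column names are invented for illustration, and the 1.5 × IQR rule is one common convention for outliers, assumed here rather than taken from the project.

```python
import numpy as np
import pandas as pd

# Illustrative records; column names and values are hypothetical.
raw = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-06-01 06:00", "2024-06-01 12:00",
        "2024-06-01 18:00", "2024-06-02 06:00",
    ]),
    "temperature": [22.0, np.nan, 31.0, 24.0],
    "water_usage": [5.0, 7.5, 250.0, 5.5],  # 250.0 is an implausible spike
})

clean = raw.copy()

# Imputation: fill the missing temperature with the column median.
clean["temperature"] = clean["temperature"].fillna(clean["temperature"].median())

# Outlier handling: cap values lying more than 1.5 * IQR beyond the quartiles.
q1, q3 = clean["water_usage"].quantile([0.25, 0.75])
iqr = q3 - q1
clean["water_usage"] = clean["water_usage"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Feature engineering: derive the hour of day from the timestamp.
clean["hour_of_day"] = clean["timestamp"].dt.hour

print(clean)
```

Capping is used here instead of deletion so that no rows are lost; whether to cap, correct, or drop an outlier depends on the context, as noted above.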
Data analysis
Data analysis is a pivotal step where the processed data is explored and examined to extract
meaningful insights. This step typically involves both descriptive and inferential statistical
techniques. Descriptive statistics, such as mean, median, and standard deviation, help summarize
the main characteristics of the data, providing an overview of the distribution and central
tendencies. Visualization tools, such as histograms, box plots, and scatter plots, are often used to
explore relationships between variables, identify trends, and detect any anomalies in the data.
In a water usage optimization project, data analysis might focus on understanding patterns such
as peak usage times, seasonal variations, or correlations between water usage and external
factors like weather conditions. For example, a correlation analysis might reveal a strong
relationship between high temperatures and increased water consumption, which could be crucial
for predicting future water needs. The insights gained during this analysis phase guide the feature
selection and model-building processes, ensuring that the most relevant variables are used to
optimize water usage effectively.
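A small sketch of this kind of exploratory analysis is given below. The data frame is made up for illustration (the season labels, temperatures, and usage values are assumptions), and the calls shown are standard Pandas operations.

```python
import pandas as pd

# Illustrative data; values and column names are hypothetical.
df = pd.DataFrame({
    "season": ["dry", "dry", "wet", "wet", "dry", "wet"],
    "temperature": [34, 31, 24, 22, 36, 25],
    "water_usage": [9.1, 8.2, 4.0, 3.6, 9.8, 4.4],
})

# Descriptive statistics: central tendency and spread of the numeric columns.
summary = df[["temperature", "water_usage"]].agg(["mean", "median", "std"])

# Seasonal variation: average water usage per season.
seasonal_mean = df.groupby("season")["water_usage"].mean()

# Relationship with an external factor: Pearson correlation with temperature.
r = df["temperature"].corr(df["water_usage"])

print(summary)
print(seasonal_mean)
print(f"temperature vs. water usage: r = {r:.2f}")
```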
Figure 3.6 one hot encoding 1
The main advantage of one-hot encoding is that it allows the model to treat categorical variables
as independent entities, preventing any unintended ordinal relationships between categories.
However, one must be cautious when applying this technique, especially when dealing with a
large number of categories, as it can lead to a significant increase in the dimensionality of the
dataset. In water usage optimization, one-hot encoding might be applied to variables like
"Region" or "Soil Type", ensuring that these categorical factors are appropriately represented in
the model. By accurately encoding categorical variables, one-hot encoding enhances the model’s
ability to learn from the data and make accurate predictions.
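For illustration, one-hot encoding with Pandas can be sketched as below. The category values are hypothetical; `pd.get_dummies` creates one indicator column per category, which also shows how the number of columns grows with the number of categories.

```python
import pandas as pd

# Illustrative rows; the category values are hypothetical.
df = pd.DataFrame({
    "region": ["humid", "desert", "humid"],
    "soil_type": ["DRY", "WET", "DRY"],
    "temperature": [30, 25, 20],
})

# Each categorical column is replaced by one 0/1 column per category.
encoded = pd.get_dummies(df, columns=["region", "soil_type"], dtype=int)

print(encoded.columns.tolist())
print(encoded)
```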
Data correlation
Data correlation analysis is a key step in understanding the relationships between different
variables within a dataset. Correlation measures how strongly two variables are related to each
other, with values ranging from -1 (perfect negative correlation) to +1 (perfect positive
correlation). A correlation close to 0 indicates little or no linear relationship between the variables. In the
context of water usage optimization, correlation analysis might reveal, for instance, how strongly
water consumption is related to factors like temperature, humidity, or the day of the week.
Identifying these relationships is crucial for feature selection, as it helps in choosing the most
predictive variables for the model.
One of the most commonly used methods for correlation analysis is Pearson’s correlation
coefficient, which measures linear relationships between variables. Depending on the nature of
the data, Spearman’s rank correlation might be more appropriate, since it captures monotonic
relationships even when they are not linear. A correlation matrix, which displays the correlation
coefficients between all pairs of variables, lets data scientists quickly identify
multicollinearity, where two or more input variables are highly correlated with each other. Such
variables carry redundant information and can make feature importance harder to interpret.
Addressing this might involve removing or combining correlated features. Overall, correlation
analysis is essential for refining the dataset and ensuring that the most informative features are
used in the water usage optimization model.
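The difference between the two coefficients, and a simple check for highly correlated inputs, can be sketched as follows. The data are synthetic and the 0.95 threshold is an arbitrary assumption chosen for the example.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
temperature = rng.uniform(15, 40, size=200)
df = pd.DataFrame({
    "temperature": temperature,
    "temp_fahrenheit": temperature * 9 / 5 + 32,  # redundant copy of temperature
    "humidity": rng.uniform(20, 90, size=200),
    "water_usage": np.exp(temperature / 10),      # monotonic but non-linear
})

pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Flag pairs of input features whose absolute Pearson correlation is very high.
features = ["temperature", "temp_fahrenheit", "humidity"]
abs_corr = pearson.loc[features, features].abs()
redundant = [
    (a, b)
    for i, a in enumerate(features)
    for b in features[i + 1:]
    if abs_corr.loc[a, b] > 0.95
]

print(pearson.loc["temperature", "water_usage"])
print(spearman.loc["temperature", "water_usage"])
print(redundant)
```

Because the synthetic water usage grows exponentially with temperature, the rank-based Spearman coefficient is higher than the Pearson coefficient for that pair.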
3.2 Random Forest Regressor machine learning model development
Data splitting
Data splitting is a critical step in the machine learning process where the dataset is divided into
separate parts for training and testing purposes. Typically, the data is split into a training set and
a testing set, with the training set used to build the model and the testing set used to evaluate its
performance. In some cases, a validation set is also used to fine-tune model parameters during
training. The most common split ratio is 80-20, where 80% of the data is used for training and
20% for testing, although this ratio can vary depending on the size of the dataset and the specific
requirements of the project.
In the context of a water usage optimization project using the Random Forest Regressor, splitting
the data ensures that the model is trained on a representative sample while being tested on unseen
data to evaluate its generalization capabilities. Keeping the training and testing sets separate
does not by itself prevent overfitting, but it makes overfitting detectable: a model that performs
well on the training data yet poorly on the held-out test data has failed to generalize. This check
is essential for creating a robust and reliable model that can predict water usage patterns in
different scenarios.
To keep the split representative, stratified sampling might be employed, especially when certain
classes or ranges of data are underrepresented. For a continuous target such as water usage, this
means grouping the target values into bins and stratifying on those bins, so that the training and
testing sets keep a distribution similar to the original dataset and the evaluation is fairer.
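A minimal sketch of an 80-20 split with scikit-learn is shown below, together with a stratified variant for a continuous target. The data are synthetic, and binning the target into quartiles is one possible way to stratify a regression target, assumed here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only.
rng = np.random.default_rng(42)
X = pd.DataFrame({"temperature": rng.uniform(15, 40, size=100)})
y = pd.Series(
    X["temperature"] * 0.3 + rng.normal(0, 1, size=100),
    name="water_requirement",
)

# Plain 80-20 split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Stratified variant: bin the continuous target into quartiles and stratify on the bins.
y_bins = pd.qcut(y, q=4, labels=False)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y_bins
)

print(len(X_train), len(X_test))
print(y_bins.loc[ys_test.index].value_counts().sort_index())
```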
Model definition
Model definition is the phase where the machine learning model is configured, and its
architecture is defined based on the problem at hand. For the water usage optimization project,
the Random Forest Regressor was chosen as the model due to its ability to handle complex
datasets with multiple features and its robustness against overfitting. Random Forest is an
ensemble learning method that builds multiple decision trees during training and outputs the
average prediction of these trees for regression tasks. The key advantage of using Random Forest
is that it reduces variance by averaging the results of different trees, leading to more stable and
accurate predictions.
During the model definition phase, several hyperparameters of the Random Forest Regressor are
set, such as the number of trees in the forest (n_estimators), the maximum depth of each tree
(max_depth), and the minimum number of samples required to split a node (min_samples_split).
These parameters control the complexity of the model and influence its performance. For
instance, increasing the number of trees generally stabilizes the model's predictions, with
diminishing returns, but also increases computational cost. Similarly, controlling the depth of the trees can help prevent
overfitting, where the model becomes too tailored to the training data and loses its ability to
generalize.
In this project, the model definition also includes selecting the features that will be used as inputs
for the Random Forest Regressor. Feature selection is a crucial step that determines the
effectiveness of the model. By including only the most relevant features, such as temperature,
weather conditions, and historical water usage, the model is better equipped to make accurate
predictions. The configured model is then ready for training on the dataset.
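The following sketch shows how such a model might be configured with scikit-learn. The hyperparameter values are placeholders chosen for the example, not the values used in this project, and the data are synthetic. The last lines check that the forest's prediction equals the average of its individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hyperparameter values are illustrative placeholders, not the project's settings.
model = RandomForestRegressor(
    n_estimators=100,     # number of trees in the forest
    max_depth=10,         # maximum depth of each tree
    min_samples_split=4,  # minimum samples required to split a node
    random_state=42,      # fixed seed for reproducibility
)

# Synthetic training data for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.1, size=200)
model.fit(X, y)

# The forest's regression output is the mean of the individual tree predictions.
sample = X[:5]
tree_mean = np.mean([tree.predict(sample) for tree in model.estimators_], axis=0)
print(np.allclose(tree_mean, model.predict(sample)))
```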
Model evaluation
Model evaluation is the process of assessing how well the trained model performs on the testing
dataset. After training the Random Forest Regressor on the water usage data, the model’s
performance is evaluated using several metrics to determine its accuracy and reliability.
Common evaluation metrics for regression tasks include Mean Squared Error (MSE), Root Mean
Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²). These metrics provide
insights into the model’s prediction error, variance, and overall fit to the data.
For the water usage optimization project, evaluating the model using these metrics helps
determine how closely the predicted water usage values align with the actual values. For
example, a low RMSE indicates that the model's predictions are close to the true values, while a
high R² value suggests that a significant proportion of the variance in the target variable is
explained by the model. During this evaluation phase, it’s also important to analyze the model's
performance across different segments of the data, such as different seasons or regions, to ensure
that it generalizes well across various conditions.
Cross-validation is another technique often used during model evaluation to ensure that the
model's performance is consistent and not dependent on a particular train-test split. In k-fold
cross-validation, the data is divided into k subsets, and the model is trained and evaluated k
times, each time using a different subset as the test set and the remaining k-1 subsets as the
training set. This process provides a more comprehensive assessment of the model’s
performance and helps in identifying any potential issues with overfitting or underfitting.
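The metrics and the k-fold procedure described above could be computed as in the sketch below, which uses synthetic data and default-style settings assumed for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.2, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Hold-out metrics on the test set.
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE
r2 = r2_score(y_test, y_pred)

# 5-fold cross-validation: one R² score per fold.
cv_r2 = cross_val_score(
    RandomForestRegressor(n_estimators=100, random_state=0), X, y, cv=5, scoring="r2"
)

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
print("CV R2 per fold:", np.round(cv_r2, 3))
```

A large spread between the per-fold scores would suggest that the result depends heavily on which rows end up in the test set.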
Model saving
Once the Random Forest Regressor has been trained and evaluated, the next step is to save the
model so it can be used later for making predictions without needing to retrain it. Model saving
is a crucial step in the machine learning pipeline as it allows for the preservation of the model's
state, including the learned parameters and architecture, enabling quick and easy deployment. In
Python, the model can be saved using libraries such as joblib or pickle, which serialize the model
object into a file that can be loaded back into memory when needed.
In the water usage optimization project, saving the trained model is particularly important
because it allows the optimization system to be deployed in real-time applications, such as
predicting water usage for future periods or adjusting water distribution strategies based on
predicted demand. By saving the model, the computational resources used during training are
conserved, and the model can be rapidly accessed and utilized in production environments.
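A minimal sketch of saving and restoring a model with joblib follows. The file name matches the one loaded in the Appendix; the training data are synthetic, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic training data for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 3))
y = 5 * X[:, 0] + rng.normal(0, 0.1, size=100)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "water_requirement_model.pkl")
    joblib.dump(model, model_path)     # serialize the fitted model to disk
    restored = joblib.load(model_path)  # load it back without retraining

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X), restored.predict(X))
print(same_predictions)
```

Since joblib and pickle files can execute code when loaded, model files should only be loaded from trusted sources.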
Model testing
Model testing is the final step in the machine learning pipeline where the saved model is
deployed and tested on new, unseen data to evaluate its real-world performance. This step is
crucial as it ensures that the model, when applied outside the training environment, performs as
expected and provides accurate predictions. In the context of the water usage optimization
project, model testing involves using the Random Forest Regressor to predict water usage for a
new time period or under new conditions, and then comparing these predictions to the actual
observed data.
Figure 3.14 model testing
During model testing, it’s important to monitor the model's performance continuously and
evaluate it using the same metrics applied during the evaluation phase, such as RMSE and R². If
the model's performance on new data significantly deviates from its performance on the test set,
this could indicate issues such as overfitting or concept drift, where the underlying data
distribution changes over time. Addressing such issues might involve retraining the model with
more recent data or fine-tuning the model's parameters.
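One simple way to operationalize this monitoring is to compare the error on newly observed data with a baseline error and flag the model when the gap is too large. The sketch below is an assumption about how this could be done: the function names, the sample values, and the 25% tolerance are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def needs_retraining(baseline_rmse, new_rmse, tolerance=0.25):
    """Return True when the new error exceeds the baseline by more than `tolerance` (a fraction)."""
    return new_rmse > baseline_rmse * (1 + tolerance)


# Illustrative values: error on the original test set vs. error on recent observations.
baseline = rmse([5.0, 6.0, 7.0], [5.5, 5.5, 7.5])
recent = rmse([5.0, 6.0, 7.0], [7.0, 8.0, 9.0])

print(baseline, recent, needs_retraining(baseline, recent))
```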
Model testing also provides an opportunity to assess how well the model integrates with other
system components, such as data pipelines and user interfaces. For instance, in a water
management system, the model's predictions might be used to trigger alerts, adjust water
distribution strategies, or provide recommendations to users. Ensuring that the model's output is
reliable and actionable in these contexts is key to the success of the water usage optimization
project.
Python's popularity in the data science community is largely due to its integration with libraries
such as Pandas for data manipulation, NumPy for numerical computations, and Scikit-learn for
machine learning tasks. These libraries offer robust functionalities for handling large datasets,
performing statistical analysis, and building predictive models, all of which are essential for
optimizing water usage. Additionally, Python's compatibility with various data formats and
databases enables seamless integration with other systems and data sources, making it a versatile
choice for end-to-end application development.
Figure 3.15 python code snippet 1
Figure 3.16 python code snippet 2
Streamlit
Streamlit is an open-source Python library that simplifies the process of creating interactive and
visually appealing web applications for data science and machine learning projects. With
Streamlit, developers can quickly transform data scripts into fully functional web apps without
needing extensive knowledge of front-end development. This ease of use makes Streamlit an
ideal tool for showcasing the results of a water usage optimization project, allowing stakeholders
to interact with the model's predictions and insights through a user-friendly interface.
One of the key features of Streamlit is its ability to automatically refresh the app whenever the
underlying code is updated, making it highly efficient for iterative development and testing. This
means that as you refine the Random Forest Regressor model or make adjustments to the data
processing pipeline, the changes are immediately reflected in the app, providing instant
feedback. Additionally, Streamlit's wide range of built-in widgets, such as sliders, buttons, and
charts, enables the creation of dynamic and interactive components, enhancing the user
experience and making complex data more accessible to non-technical users.
Another advantage of using Streamlit is its seamless integration with Python libraries, allowing
for the easy embedding of plots, tables, and other data visualizations directly into the app. This
capability is particularly useful for the water usage optimization project, where visual
representations of data and model predictions can help in understanding trends, identifying
patterns, and making informed decisions. By leveraging Streamlit's features, developers can
create engaging and interactive applications that effectively communicate the value of their data
science projects.
Hosting
Hosting a Streamlit app on Streamlit Cloud offers numerous advantages, especially for
developers looking to deploy their applications quickly and efficiently. Streamlit Cloud is a
platform specifically designed for deploying Streamlit apps, providing a streamlined and user-
friendly environment for hosting data science and machine learning applications. One of the
main benefits of using Streamlit Cloud is its simplicity; developers can deploy their apps directly
from a GitHub repository with minimal configuration, making it easy to share their work with
others and receive feedback.
Streamlit Cloud also manages the hosting infrastructure, so the app can be served without manual
server provisioning or maintenance. This is beneficial for applications like the water usage
optimization project, where the number of users might fluctuate based on demand, although hosted
apps remain subject to the platform's resource limits. Additionally, the platform exposes
application logs, allowing developers to track app behavior and identify potential issues as they
occur. Its Deploy button also makes publishing an app straightforward.
Another significant advantage of hosting on Streamlit Cloud is the integration with secure
authentication options, enabling developers to control access to their apps. This is crucial for
projects that involve sensitive data or require restricted access, such as those dealing with
proprietary models or confidential information. By hosting the water usage optimization app on
Streamlit Cloud, developers can benefit from a secure, scalable, and hassle-free deployment
environment, allowing them to focus on refining their models and delivering actionable insights
to stakeholders.
CHAPTER FOUR
Best Requirements
Stable internet connection
4.1 Results
The trained model achieved a Mean Absolute Error (MAE) of 1.2948, a Mean Squared Error (MSE)
of 18.3243, and an R-squared (R²) score of 0.1435. The following demonstrates the final outcome
of the system, its uses, and its functions.
Interfaces
On entering the website, the user is greeted with a visually appealing and intuitive interface
designed to simplify the process of predicting water requirements for crops. The main page
prominently features a series of dropdown menus or input fields, each corresponding to a crucial
factor in determining the water needs:
Crop Type: A dropdown menu allows the user to select from a list of 15 different crop types,
including BANANA, SOYABEAN, CABBAGE, POTATO, RICE, MELON, MAIZE,
CITRUS, BEAN, WHEAT, MUSTARD, COTTON, SUGARCANE, TOMATO, and
ONION. This ensures that the user can quickly identify and select the specific crop they are
growing.
Soil Type: Another dropdown menu provides options for various soil types, such as sandy,
loamy, clay, and others. The user can choose the type that matches their field's soil
composition, which is a critical factor in water retention and usage.
Weather Condition: This field allows the user to input the current weather conditions or
select from predefined options like sunny, rainy, or windy. Weather plays a significant role in
determining the evapotranspiration rate, which directly affects water needs.
Region: A dropdown or selection box is provided for the user to choose their geographical
region. Different regions have different climate patterns, and this input helps tailor the
prediction to local conditions.
Figure 4.1 application’s homepage
Temperature: The user can enter the current temperature or select from a range of predefined
temperature brackets. Temperature influences the rate of water evaporation and plant
transpiration, making it a vital input for accurate predictions.
Each input field is clearly labeled and designed for ease of use, ensuring that even users with
minimal technical knowledge can easily navigate the page. Below the input fields, a prominently
displayed Predict button encourages the user to submit their selections.
Once the user clicks the Predict button, the selected values are sent to the backend model, which
processes the inputs using a pre-trained machine learning algorithm. The model considers all the
factors—crop type, soil type, weather condition, region, and temperature—to calculate the
estimated water requirement for the selected crop.
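Because the model was trained on one-hot encoded features, the user's selections have to be converted into a row with the same columns before prediction. The sketch below shows one way this could be done. The column list is a hypothetical stand-in for the `column_names` object loaded in the Appendix, and `build_model_input` is an illustrative helper, not the project's actual function.

```python
import pandas as pd

# Hypothetical training-time columns; the app would load these from column_names.pkl.
column_names = [
    "TEMPERATURE",
    "CROP TYPE_MAIZE", "CROP TYPE_RICE",
    "SOIL TYPE_DRY", "SOIL TYPE_WET",
]


def build_model_input(selection, column_names):
    """One-hot encode a single user selection and align it to the training columns."""
    row = pd.get_dummies(pd.DataFrame([selection]), dtype=int)
    # Columns for categories the user did not select are added and filled with 0.
    return row.reindex(columns=column_names, fill_value=0)


selection = {"CROP TYPE": "RICE", "SOIL TYPE": "WET", "TEMPERATURE": 30}
features = build_model_input(selection, column_names)
print(features)
```

The resulting single-row frame could then be passed to the loaded model's `predict` method.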
The prediction result is then displayed on the same page, offering the user actionable insights on
how much water to apply to their crops. This immediate feedback helps farmers and gardeners
make informed decisions, optimizing water usage, promoting sustainable farming practices, and
ensuring healthy crop growth.
4.2 Discussion
The system's design prioritizes accessibility and ease of use, requiring only basic internet access
and a modern web browser to function. This ensures that users can interact with the application
from a variety of devices, making it highly accessible to farmers and gardeners regardless of
their technical expertise. A stable internet connection is recommended for smooth operation,
allowing the application to process inputs and deliver predictions efficiently.
The application itself features an intuitive interface where users can easily select various factors
like crop type, soil type, weather conditions, region, and temperature. These inputs are critical
for accurately predicting the water requirements for different crops. Once the inputs are
submitted, the backend model processes the data using a pre-trained machine learning algorithm
to provide actionable insights on water usage. The model's current performance metrics (an R² of
about 0.14) indicate substantial room for improvement, so its predictions are best treated as
indicative guidance that helps users make more informed decisions to optimize water usage and
support sustainable farming practices.
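The metrics reported in Section 4.1 can be put into perspective with a short calculation. The three input values are taken from the results above; the interpretation in the comments is a general reading of these metrics rather than a finding specific to this dataset.

```python
import math

# Values reported in Section 4.1.
mae = 1.2947696104521447
mse = 18.324328330923336
r2 = 0.14354572227084583

# RMSE is in the same units as the target, unlike MSE.
rmse = math.sqrt(mse)

# Share of the target's variance that the model does not explain.
unexplained_share = 1 - r2

print(f"RMSE = {rmse:.2f}")
print(f"Unexplained variance = {unexplained_share:.1%}")
```

An RMSE that is much larger than the MAE, as here, usually suggests that a small number of large errors dominate the squared-error metric, which may be worth investigating when refining the model.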
CHAPTER FIVE
of the world’s freshwater. With the growing global population and the need for sustainable
farming practices, optimizing water use is crucial. The research focuses on developing a Random
Forest Regressor model for real-time monitoring and optimization of water usage. The model
utilizes data such as crop type, soil type, weather conditions, and temperature to predict water
requirements accurately. The system also includes a user-friendly interface for farmers to easily
interact with the model and make informed irrigation decisions.
Conclusion
The implementation of machine learning and IoT technologies in agriculture, as demonstrated by
the developed system, shows significant potential in enhancing water-use efficiency and
supporting sustainable farming practices. The model’s ability to predict water requirements
based on real-time data can lead to healthier crops, reduced water wastage, and lower energy
costs. However, the model's current performance metrics, such as the mean absolute error and R-
squared score, indicate that there is room for improvement in its accuracy and reliability.
Limitations
The study's limitations include the model's performance, which still requires refinement to
improve prediction accuracy. Additionally, the reliance on specific data types, such as the crop
and soil types, means that the model may not generalize well to different agricultural contexts or
regions without further adjustments. The complexity of integrating diverse data sources, the
potential for overfitting, and the need for continuous data updates also present challenges.
Moreover, the system's effectiveness depends on the quality and availability of real-time data,
which may not be accessible in all farming environments.
Recommendation
To enhance the system's accuracy and applicability, it is recommended to expand the dataset to
include a wider variety of crops, soil types, and regions, allowing the model to generalize better.
Further research should focus on improving the model's predictive capabilities by exploring
advanced machine learning techniques and using cross-validation to detect overfitting.
Additionally, integrating more sophisticated IoT sensors and ensuring the continuous updating of
data will improve real-time decision-making. Training programs for farmers on using the system
effectively will also facilitate broader adoption and maximize the benefits of this technology in
optimizing water use in agriculture.
References
Bilzikova, et al. (2020). A scoping review of the contributions of farmers' organizations to
smallholder agriculture.
Foster, et al. (2020). Satellite-based monitoring of irrigation water use: Assessing measurement
errors and their implications for agricultural water management policy.
Huang, et al. (2020). Water-saving agriculture can deliver deep water cuts for China.
Qazi, et al. (2022). IoT-Equipped and AI-Enabled Next Generation Smart Agriculture: A Critical
Review Current Challenges and Future Trend.
Saad, et al. (2020). Water management in agriculture: A survey on current challenges and
technological solutions.
Zhai, et al. (2020). Decision support systems for agriculture 4.0: Survey and challenges.
Benos, L., Tagarakis, A. C., Dolijanovic, Z., Bochtis, D., & Ampatzidis, Y. (2021). Machine
learning in agriculture: A comprehensive updated review. Agronomy, 11(9), 1784.
Bhat, S., & Huang, X. (2021). Big data and AI revolution in precision agriculture: Survey and
challenges. Computers and Electronics in Agriculture, 187, 106240.
Faouzi, S., Bargaoui, Z., & Hambli, A. (2020). Wastewater reuse in agriculture sector: Resources
management and adaptation in the context of climate change: Case study of the Beni Mellal-
Khenifra region, Morocco. Journal of Environmental Management, 273, 111070.
Javaid, M. M., Waseem, M., Qamar, U., & Ahmad, M. (2023). Understanding the potential
applications of Artificial Intelligence in the agriculture sector. Computers and Electronics in
Agriculture, 207, 107693.
Jin, X., Sun, X., & Geng, X. (2020). Deep learning predictor for sustainable precision agriculture
based on Internet of Things system. Sustainable Cities and Society, 60, 102216.
Meshram, P., Sharma, P., & Shukla, P. (2021). Machine learning in agriculture domain: A state-
of-art survey. Materials Today: Proceedings, 47, 1200-1206.
Sharma, A., Khatri, P., & Kumar, S. (2021). Machine learning applications for precision
agriculture: A comprehensive review. Journal of Artificial Intelligence and Soft Computing
Research, 11(2), 99-111.
Steinfeld, H., Gerber, P., & Wassenaar, T. (2020). The human dimension of water availability:
Influence of management rules on water supply for irrigated agriculture and the environment.
Water, 12(8), 2167.
Veeragandham, M., & Santhi, S. (2020). A review on the role of machine learning in agriculture.
Journal of Agricultural Engineering, 57(1), 1-14.
Wang, H., Huang, Y., Chen, H., & Zhang, J. (2022). A review of deep learning in multiscale
agricultural sensing. Remote Sensing, 14(1), 27.
Zhang, Q., Liu, J., & Yang, B. (2020). Applications of deep learning for dense scenes analysis in
agriculture: A review. Computers and Electronics in Agriculture, 176, 105672.
Appendix
import streamlit as st
import pandas as pd
import joblib
model = joblib.load('water_requirement_model.pkl')
column_names = joblib.load('column_names.pkl')
st.markdown(
    """
    <style>
    .stApp {
        background-image: url("https://plus.unsplash.com/premium_photo-1661825536186-19606cd9a0f1?w=400&auto=format&fit=crop&q=60&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxzZWFyY2h8NXx8d2F0ZXIlMjB1c2UlMjBpbiUyMGFncmljdWx0dXJlfGVufDB8fDB8fHww");
        background-size: cover;
        background-position: center;
    }
    </style>
    """,
    unsafe_allow_html=True
)
# Input fields on the main page instead of the sidebar
st.header('Input Parameters')
])
input_values = {
'CROP TYPE_MELON': 1 if crop_type == 'MELON' else 0,
}
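The prediction step passes a data frame named input_df to the model, and the saved column_names list suggests that the encoded inputs are arranged in the training column order first. A minimal sketch of that step, assuming input_values holds one row of encoded features: the helper name build_input_frame and the example column names other than 'CROP TYPE_MELON' are hypothetical and only for illustration.

```python
import pandas as pd


def build_input_frame(input_values, column_names):
    """Arrange one row of encoded inputs in the column order used at training time."""
    row = pd.DataFrame([input_values])
    # Columns absent from the form are filled with 0 (an unselected one-hot
    # category); keys that are not in column_names are dropped.
    return row.reindex(columns=list(column_names), fill_value=0)


# Hypothetical values, for illustration only
example_columns = ['CROP TYPE_MELON', 'CROP TYPE_TOMATO', 'TEMPERATURE']
example_values = {'CROP TYPE_MELON': 1, 'TEMPERATURE': 30}
input_df = build_input_frame(example_values, example_columns)
```

Reindexing against the saved column list keeps the feature order identical to the one the model was fitted on, which scikit-learn style estimators generally expect.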
# Prediction
if st.button('Predict'):
    prediction = model.predict(input_df)[0]
    st.markdown(
        unsafe_allow_html=True