Ai-Based Machine Learning Algorithms For Water Quality Analysis: A Review

Uploaded by

Imtiaz Hossain
Conference Paper · April 2024

4 authors: Nilusha Perera (Uva Wellassa University), Warunika Karunasiri (University of Colombo),
R M L S Rajapaksha (University of Colombo), Kosala Sirisena (University of Colombo)

All content following this page was uploaded by Nilusha Perera on 27 April 2024.

ICSBE 2023-268
AI-BASED MACHINE LEARNING ALGORITHMS FOR WATER QUALITY ANALYSIS:
A REVIEW

R.M.L.S. Rajapaksha1*, R.P.W.N. Karunasiri1, T.A.N.T. Perera2, K. Sirisena1


1 Department of Environmental Technology, Faculty of Technology, University of Colombo, Sri Lanka
2 Department of Export Agriculture, Faculty of Animal Science and Export Agriculture, Uva Wellassa University of Sri Lanka
*Correspondence E-mail: [email protected], TP: +94773132103

Abstract: Water quality assessment is a cornerstone of environmental management, with profound
implications for human health and well-being. Artificial Intelligence (AI) based machine learning
(ML) algorithms have ushered in a paradigm shift by enabling precise, efficient, and real-time
analysis of water quality parameters. However, in-depth analysis of the evolving AI-driven ML
applications in water quality analysis is lacking for tropical regions. Therefore, this review was
conducted (a) to explore ML algorithms in groundwater, surface water, and drinking water
assessments, and (b) to investigate potential future applications of ML algorithms to water
quality studies. Thirty indexed journal articles published from 2012 to 2022 were reviewed in
terms of the characteristics and capabilities of the methods used, taking into account their input
and output data. The review is confined to English-language, peer-reviewed journal papers and
books in Science Direct. The key search terms were Artificial Intelligence, Machine Learning and
Water Quality. ML algorithms have advanced water quality assessment across diverse aquatic
contexts. They enable the analysis of extensive datasets, providing real-time insights and
accurate predictions for improved water quality management. In surface water quality evaluation,
ML models use data from multiple sources, such as remote sensing and in-situ sensors, to monitor
parameters such as turbidity and dissolved oxygen, supporting early pollution identification and
management measures. In groundwater quality monitoring, these algorithms analyze data from
monitoring wells to ensure the safety of drinking water sources. By assessing water quality data
and operational parameters, ML optimizes drinking water treatment operations. In aquaculture
management, it anticipates and adjusts characteristics such as temperature and pH to optimize
operations. Future applications include early warning systems for pollution incidents and
assessment of the impact of climate change. The evolution of ML promises more informed
decision-making and improved water quality management across aquatic environments. This review,
a comprehensive synthesis of insights drawn from diverse research endeavours, not only advances
the current understanding of AI-based water quality analysis but also delineates the path for
future scientific exploration in support of water quality management and ecological
sustainability.

Keywords: Artificial Intelligence; Groundwater; Machine Learning; Surface Water; Water Quality

1. Introduction
Water is essential for all living beings, regardless of where or how they live. It plays a
significant role in maintaining human health and welfare, so drinking water must be purified
before it is consumed or used for other purposes. Water quality and suitability for use are
determined by taste, odour, colour, and the concentration of organic and inorganic matter
(Dharmasena and Project, 2021). Some people use municipal water supplies, while others rely on
private wells that are not monitored by government or municipal agencies. Well owners must
therefore take the initiative to monitor their own water quality. Over the last few decades, most
well water sources have been subjected to pollution from sources such as agricultural activities,
human settlement, deforestation, and industrial waste (Alharthi, 2015). Developing nations
frequently experience worse conditions because their economies depend heavily on agriculture,
which is often the main source of income for millions of people (Trentinaglia, Baldi and Peri,
2023). Due to a lack of resources and technology, many farmers in these nations are unable to
adopt sustainable and ecologically friendly agricultural practices. This results in widespread
water source pollution, local ecological deterioration, and health issues for those who depend on
these sources (Zahoor and Mushtaq, 2023).

This review examines the landscape of AI-based ML algorithms in water quality analysis. It
explores their applications across various domains, including groundwater, surface water, and
drinking water assessments. Drinking water pollution in heavily agricultural areas can be
identified using parameters such as pH, electrical conductivity (EC), Total Dissolved Solids
(TDS), Nitrate (NO3-), Potassium (K+), Iron (Fe), Manganese (Mn2+), Magnesium (Mg2+), Lead (Pb2+),
Cadmium (Cd2+), Zinc (Zn2+), Copper (Cu2+), and Arsenic (As). A few ML techniques show strong
prediction results in water quality analysis: logistic regression (LR), support vector machine
(SVM) and random forest (RF) models are typically compared using metrics such as accuracy,
precision and recall. Furthermore, this paper sheds light on the challenges associated with
implementing these algorithms and highlights potential future applications that hold promise for
advancing our understanding of water quality. As the world grapples with evolving water-related
challenges, ranging from contamination threats to the sustainable management of water resources,
this review seeks to provide a comprehensive overview of how AI-based ML algorithms are reshaping
the field of water quality analysis. By examining their current use, addressing challenges, and
envisioning future possibilities, we aim to underscore the significance of this technological
frontier in safeguarding our most precious resource: water.
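
As a brief illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision and recall for a binary "polluted" versus "not polluted" classifier. The function name `classification_metrics` and the label values are hypothetical and are not taken from any of the reviewed studies; any of the classifiers mentioned (LR, SVM, RF) could supply the predictions.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for binary labels (1 = polluted, 0 = not polluted)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # polluted, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # polluted, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground truth and model predictions for eight well samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a water safety setting, recall is often the metric to watch, since a missed polluted sample (a false negative) is usually costlier than a false alarm.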

2. ML algorithms in water quality analysis

2.1 ML algorithms in groundwater


During the past three decades, ML techniques have been widely employed in preference to
statistical methods because of their better performance in estimating groundwater quality for
drinking purposes (Trentinaglia, Baldi and Peri, 2023). Groundwater contamination can pose
considerable risks to both human health and the environment. It can result from the leaching of
organic material, pesticides, and nitrates into aquifers through industrial, agricultural and
various other human activities. Numerous studies have assessed the potability of groundwater
using the following key factors: Total Hardness (TH), Calcium (Ca2+), Sodium (Na+), Sulphate
(SO42-), Chloride (Cl-), Bicarbonate (HCO3-), Fluoride (F-), pH, EC, Mg2+, K+, and NO3- (Garba,
2018). In India, ML technologies are being used successfully because data on water quality is
easily accessible from a variety of sources (Pandey and Verma, 2022). These methods are used to
analyse and estimate water quality against previously reported data, and to support the
development of newer ML approaches such as deep neural networks and XGBoost alongside traditional
ML models (Manuel, Pedro and Pires, 2021).

Support vector regression (SVR) and regression trees (RT) can be used to forecast wastewater
quality (Granata, Papirio and Esposito, 2017). ML algorithms play a pivotal role in safeguarding
the quality of groundwater, an indispensable source of freshwater. Convolutional neural networks
(CNNs) and regression models can predict contamination levels by analyzing historical groundwater
quality data alongside geological characteristics, land usage, and pollution incidents (Busico,
Kazakis and Cuoco, 2020). Reinforcement learning can be used to optimize remediation strategies
and recommend cost-effective and environmentally friendly solutions (Bionics, 2023). Real-time
monitoring systems powered by ML detect anomalies in groundwater well sensor data, enabling
prompt action. Geographic information system (GIS) integration aids spatial analysis by
identifying pollution hotspots for targeted intervention (Union, 2021). Additionally, data fusion
techniques combine various datasets to reveal complex relationships, showcasing ML's adaptability
in predicting contamination levels, optimizing remediation, and enabling real-time monitoring
(Huang and Fohrer, 2023). Common techniques such as RF, decision trees (DT), neural networks,
K-Nearest Neighbors, Naive Bayes, and SVM further enhance groundwater quality analysis by
providing robust predictions and interpretability (Tao, Majeed and Abdulameer, 2022).
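
To make one of the techniques listed above concrete, the sketch below classifies a groundwater sample as potable or non-potable with a K-Nearest Neighbors vote. It is only an illustrative sketch: the feature choice (pH, EC in µS/cm, NO3- in mg/L), the sample values, the labels and the function names are all hypothetical and are not drawn from the reviewed studies. Features are standardized first because EC is numerically much larger than pH and would otherwise dominate the distance.

```python
import math
from collections import Counter


def fit_scaler(rows):
    """Return (mean, std) per column so that features share a common scale."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        stats.append((mean, std or 1.0))  # guard against constant columns
    return stats


def scale(row, stats):
    """Standardize one sample using the training-set statistics."""
    return [(v - mean) / std for v, (mean, std) in zip(row, stats)]


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    stats = fit_scaler(train_rows)
    scaled_query = scale(query, stats)
    neighbours = sorted(
        (math.dist(scale(row, stats), scaled_query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training samples: [pH, EC (uS/cm), NO3- (mg/L)].
samples = [
    [7.1, 350.0, 5.0], [6.9, 420.0, 8.0], [7.4, 390.0, 4.0],
    [6.2, 1500.0, 60.0], [5.9, 1800.0, 75.0], [6.4, 1300.0, 55.0],
]
labels = ["potable"] * 3 + ["non-potable"] * 3

print(knn_predict(samples, labels, [7.0, 400.0, 6.0]))
print(knn_predict(samples, labels, [6.0, 1600.0, 70.0]))
```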

2.2 ML algorithms in surface water


ML has emerged as a powerful tool for the comprehensive management and understanding of surface
water quality in vital aquatic ecosystems like rivers, lakes, and reservoirs (Miller and Durlik,
2023). These algorithms play a multifaceted role in safeguarding these water sources. Firstly, ML
algorithms enable continuous monitoring and assessment by integrating data from various sources,
including water quality sensors, weather stations, and high-resolution satellite imagery (Zhu,
Wang and Yang, 2022). This combination of data sources allows ML models to provide real-time
insights into the changing condition of surface waters. These models can detect anomalies and
trends that may indicate pollution events or ecological shifts. This real-time monitoring
capability is indispensable for early intervention and timely decision-making, which are crucial
for ensuring water quality.

Secondly, ML supports predictive modelling, offering forecasts of pollution levels and water
quality parameters. Algorithms such as RF, Gradient Boosting, and Long Short-Term Memory (LSTM)
networks use historical data and environmental variables like rainfall and land use to create
models capable of predicting water pollution levels in rivers and lakes (Surucu, Andrew and
Yawney, 2023). This predictive capability aids in proactively managing water resources, allowing
authorities and environmentalists to implement preventive measures and optimize resource
allocation.

Moreover, ML-driven anomaly detection systems are pivotal components of surface water quality
monitoring (Shams, Elshewey and Sayed, 2023). Using techniques such as clustering and outlier
detection, these systems can identify unusual patterns or deviations from expected water quality
conditions. The detection of such anomalies acts as a trigger, prompting further investigation
and facilitating rapid response actions, which are critical for mitigating potential
environmental hazards.

Furthermore, ML algorithms contribute significantly to the assessment of the health of aquatic
ecosystems. By analysing water quality data in tandem with ecological indicators, scientists gain
insights into the complex relationship between water quality and aquatic life. This holistic
approach enables a deeper understanding of the impact of water quality on biodiversity,
facilitating more informed decisions regarding the conservation and restoration of these
ecosystems.

Lastly, the integration of ML with the Internet of Things (IoT) is instrumental in modern surface
water quality monitoring. IoT devices such as water quality sensors and underwater drones
continuously collect data from various points within these ecosystems. ML algorithms process and
analyse this influx of data in real time, turning it into actionable information. This synergy
between ML and IoT ensures that decision-makers have access to up-to-date and comprehensive data
for effective resource management and environmental protection (Cui, Yang and Zuo, 2018).

In conclusion, ML has become an indispensable ally in the quest to understand, manage, and
safeguard the quality of surface water bodies. Its contributions span real-time monitoring,
predictive modelling, anomaly detection, ecosystem health assessment, and integration with IoT
devices. As we continue to grapple with environmental challenges, ML stands as an asset in our
efforts to preserve and protect these vital sources of freshwater for generations to come.
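
The outlier detection step described above can be sketched with a simple robust statistic. The example below is an assumption-laden stand-in rather than any method used in the reviewed studies: it flags readings whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a commonly used cut-off of 3.5. The turbidity values (in NTU) and the function name `flag_outliers` are hypothetical.

```python
import statistics


def flag_outliers(readings, threshold=3.5):
    """Return indices of readings whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * (x - median) / MAD, which is less
    sensitive to the outliers themselves than a mean/std based z-score.
    """
    median = statistics.median(readings)
    mad = statistics.median([abs(x - median) for x in readings])
    if mad == 0:
        return []  # all readings (nearly) identical: nothing to flag
    return [
        i for i, x in enumerate(readings)
        if abs(0.6745 * (x - median) / mad) > threshold
    ]


# Hypothetical hourly turbidity readings (NTU) from a river sensor.
turbidity = [4.1, 3.9, 4.3, 4.0, 4.2, 12.8, 4.1, 3.8]
flagged = flag_outliers(turbidity)
print(flagged)  # index of the suspicious reading(s)
```

A flagged index would act as the trigger for further investigation; in practice the threshold and the window of readings considered would need to be tuned to the sensor and the site.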

In surface water quality analysis, various ML models are commonly employed. Random Forest, known
for its robustness and suitability for large datasets, predicts pollution levels by considering
environmental factors. Gradient Boosting techniques like XGBoost and LightGBM offer high
predictive accuracy for water quality parameters. Long Short-Term Memory (LSTM) networks are
suited to modelling time series data, capturing water quality fluctuations over time. CNNs extend
their image recognition capabilities to analyze remote sensing or underwater images, aiding in
pattern and anomaly detection. Lastly, SVM is versatile, serving both classification and
regression tasks, which enables the classification of water samples and the prediction of
pollutant concentrations based on input features (Rodr, Vogt and Bajorath, 2017). These models
collectively help researchers and environmental experts to better understand and manage surface
water quality (Zhu, Wang and Yang, 2022).

These ML models can be adapted to specific surface water quality prediction or monitoring
tasks based on the available data and the objectives of the analysis. Researchers and practitioners
often choose the most suitable model or combination of models based on the complexity of the
problem and the performance requirements of the application.

2.3 ML algorithms in drinking water assessments


ML techniques are increasingly pivotal in the assessment of drinking water quality and safety,
contributing to the continuous monitoring and maintenance of safe and clean drinking water
supplies and thereby safeguarding public health. Several common methods find application in this
domain. SVMs, for instance, are instrumental in classifying water samples as "safe" or
"contaminated" based on chemical, microbial, and physical parameters. Random Forest and Decision
Trees are versatile, predicting water quality parameters and categorizing water sources by
quality. Neural networks, including CNNs and recurrent neural networks (RNNs), are employed for
tasks like microorganism detection in sensor images and modelling temporal dependencies in water
quality time series. Gradient Boosting Machines, such as XGBoost and LightGBM, perform well in
predictive modelling using historical and environmental data. Anomaly detection algorithms
identify unusual patterns, aiding in the detection of contamination events. Principal Component
Analysis (PCA) reduces data dimensionality, cluster analysis groups similar samples, and k-means
clustering identifies regions with shared water quality characteristics. Together, these
techniques enhance our ability to safeguard drinking water supplies efficiently and effectively.

For continuous monitoring of drinking water quality, time series analysis techniques like
Autoregressive Integrated Moving Average (ARIMA) or LSTM networks can model and predict water
quality variations over time (Shams, Elshewey and Sayed, 2023). The choice of ML method depends
on the specific goals of the assessment, the nature of the data available, and the desired level
of accuracy and interpretability. ML enables efficient and data-driven decision-making in
ensuring safe and high-quality drinking water for communities. ML models can also predict the
concentrations of contaminants and other critical water quality parameters. By analyzing
historical data and sensor inputs, these algorithms can detect deviations from established norms,
supporting the early detection of potential hazards (Soori, Arezoo and Dastres, 2023).
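
The time series idea above can be illustrated with a deliberately simplified model. The sketch below fits a first-order autoregressive model, AR(1), by ordinary least squares; this is the simplest special case of the ARIMA family (ARIMA(1,0,0)) and is used here only as a stand-in, not as the approach of any reviewed study. The pH series and the function names `fit_ar1` and `forecast` are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps, c, phi):
    """Iterate the fitted recursion forward from the last observation."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical pH readings relaxing towards a baseline after a disturbance.
ph_series = [8.0, 7.6, 7.4, 7.3, 7.25, 7.225]
c, phi = fit_ar1(ph_series)
next_values = forecast(ph_series, steps=2, c=c, phi=phi)
print(round(c, 3), round(phi, 3), [round(v, 4) for v in next_values])
```

Real monitoring data would call for differencing, seasonality handling and noise modelling, which is where full ARIMA or LSTM models come in; the sketch only shows the fit-then-forecast pattern.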
3. Challenges
In research and modelling for water quality analysis, several critical challenges come to the
forefront. Chief among them is model interpretability. While ML models can offer accurate
predictions, their complexity can make it difficult to understand why they make specific
decisions. Transparency is especially crucial in water quality analysis, where decisions affect
public health and the environment. Researchers must actively work on developing interpretable
models and post hoc interpretability techniques to ensure that model outcomes are explainable.
Addressing these challenges is integral to the success of modelling research in water quality
analysis. Researchers and practitioners continue to devise solutions and best practices to
overcome these obstacles and to unlock the full potential of AI-based ML algorithms in
safeguarding precious water resources.
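
One widely used post hoc interpretability technique is permutation importance: shuffle one input feature at a time and measure how much the model's error grows. The sketch below is a minimal, assumption-based illustration. The "black box" is a hypothetical stand-in function, and the feature names (turbidity, nitrate, temperature), data values and function names are invented for the example rather than taken from the reviewed studies.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of the model on the given samples."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row):
    """Hypothetical trained model: uses turbidity and nitrate, ignores temperature."""
    turbidity, nitrate, _temperature = row
    return 2.0 * turbidity + 0.5 * nitrate


# Hypothetical samples: [turbidity (NTU), nitrate (mg/L), temperature (deg C)].
rows = [
    [1.0, 2.0, 24.0], [2.5, 4.0, 26.0], [4.0, 9.0, 25.0],
    [5.5, 6.0, 27.0], [7.0, 12.0, 23.0], [8.5, 15.0, 28.0],
]
# Hypothetical observed pollution scores (model output plus small noise).
targets = [3.05, 6.95, 12.53, 14.0, 19.96, 24.52]

importances = permutation_importance(black_box, rows, targets)
for name, value in zip(["turbidity", "nitrate", "temperature"], importances):
    print(f"{name}: {value:.3f}")
```

A feature the model ignores receives an importance of zero, while features it relies on receive positive scores, which gives decision-makers a rough ranking of what drives a prediction without opening up the model itself.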

4. Potential future applications of ML algorithms to water quality studies

Table 1: Potential future applications and research gaps

Area of application | Potential future applications | Research gaps
Real-time water quality monitoring | Continuous monitoring, early contamination detection, automated alert systems | Developing cost-effective sensor networks, addressing data integration challenges
Predictive modelling for pollution events | Predicting pollution sources, proactive management, ecosystem impact assessment | Enhancing predictive accuracy, considering complex environmental interactions, integrating remote sensing data
Integrated water quality management | Adaptive management, decision support systems, cross-domain data integration | Scalability and interoperability of integrated systems, data privacy and ethics considerations, effective stakeholder engagement
Source identification | Efficient source localization, source data fusion, remote sensing integration | Precise source identification algorithms, handling diverse data types, ensuring real-time capabilities
Adaptive water treatment | Real-time optimization, energy efficiency, contaminant removal | Optimization under varying water quality conditions, minimizing environmental impact, integrating AI with existing treatment infrastructure
Epidemiological studies | Identifying links between health and water quality, early disease outbreak detection | Enhanced data quality and reliability, privacy-preserving data sharing, interdisciplinary collaboration
Climate change impact assessment | Assessing climate change effects, adaptation strategies, long-term modelling | Modelling climate impact at regional scales, incorporating uncertainty in predictions, quantifying potential feedback loops
Data fusion for water quality | Integration of heterogeneous data sources, enhancing data quality | Effective data fusion algorithms, data accuracy and consistency, addressing data silos
Environmental remediation | Effective remediation strategies, cost-efficient solutions, impact minimization | Models for optimal site selection, long-term remediation effectiveness assessment, holistic environmental impact assessment

5. Conclusion
The integration of ML algorithms into water quality analysis has undeniably revolutionized the
field of environmental science and resource management. This review highlights the remarkable
progress made by ML in enhancing our ability to monitor, predict, and safeguard water quality
in diverse domains, encompassing groundwater, surface water, and drinking water assessments.
The transformative impact of ML algorithms in water quality analysis cannot be overstated.
From anticipating contamination events in groundwater to optimizing the treatment processes
for safe drinking water, these algorithms have empowered researchers, regulators, and
stakeholders to make informed decisions and respond effectively to emerging challenges. Issues
related to data quality, the interpretability of intricate models, and the scalability of
algorithms for large-scale monitoring networks demand ongoing attention and inventive solutions.
In light of these challenges, we emphasize the urgent need for continued research and innovation
in the realm of ML-based water quality analysis. Collaborative endeavours involving scientists,
engineers, policymakers, and industry experts are essential to bridge existing research gaps,
refine current algorithms, and pioneer novel approaches to overcome the limitations of current
methodologies. The future of water quality analysis resides at the nexus of cutting-edge
technology and environmental stewardship. As we embark on this journey, our commitment remains
unwavering to harness the capabilities of ML in safeguarding our most invaluable resource: water.
Through fostering interdisciplinary collaboration and investing in research and development,
we collectively chart a course towards a sustainable, water-secure future. In summation, the
fusion of ML algorithms with water quality analysis stands as a testament to human innovation
and the ever-evolving pursuit of knowledge. It beckons us to push the boundaries of what is
achievable and reiterates our duty to preserve and protect the life-sustaining waters that nourish
us all.

References

Alharthi, F.A. (2015) ‘Analysis of Physiochemical Parameters to Evaluate the Drinking Water
Quality in the State of Perak, Malaysia’, 2015(Cd).
Bionics, A. (2023) ‘Retraction Retracted: Water Quality Prediction Using Artificial Intelligence
Algorithms’, 2020.
Busico, G., Kazakis, N. and Cuoco (2020) ‘A novel hybrid method of specific vulnerability to
anthropogenic pollution using multivariate statistical and regression analyses’, Water Research,
p. 115386. Available at: https://doi.org/10.1016/j.watres.2019.115386.
Chen, K. et al. (2020) ‘Comparative analysis of surface water quality prediction performance and
identification of key water parameters using different machine learning models based on big
data’, Water Research, 171, p. 115454. Available at:
https://doi.org/10.1016/j.watres.2019.115454.
Cui, L., Yang, S. and Zuo, M. (2018) ‘A survey on the application of machine learning for Internet of
Things’, International Journal of Machine Learning and Cybernetics, 0(0), p. 0.
Available at: https://doi.org/10.1007/s13042-018-0834-5.
Dharmasena, P.B. and Project, W.P. (2021) ‘Current Status of Land Degradation in Badulla
District Current Status of Land Degradation in Badulla District Rehabilitation of Degraded
Agricultural Lands in Kandy, Badulla and P . B . Dharmasena National Consultant – Land
management and SLM Information Management July 2014’, (November).
Garba, M.A. (2018) ‘PHYSICOCHEMICAL ANALYSIS OF GROUNDWATER SAMPLES OF
GWOZA TOWN AND ENVIRONS, NORTHEASTERN NIGERIA .’, (December).
Granata, F., Papirio, S. and Esposito (2017) ‘Machine Learning Algorithms for the Forecasting of
Wastewater Quality Indicators’, pp. 1–12. Available at: https://doi.org/10.3390/w9020105.
Huang, J. and Fohrer, N. (2023) ‘A grid-based interpretable machine learning method to understand the
spatial relationships between watershed properties and water quality’, 154(July). Available at:
https://doi.org/10.1016/j.ecolind.2023.110627.
Manuel, C., Pedro, M. and Pires, M. (2021) ‘A Comparison of AutoML Tools for Machine
Learning, Deep Learning and XGBoost’, (October). Available at:
https://doi.org/10.1109/IJCNN52387.2021.9534091.
Miller, T. and Durlik, I. (2023) ‘Applied Sciences Predictive Modeling of Urban Lake Water
Quality Using Machine Learning : A 20-Year Study’.
Pandey, J. and Verma, S. (2022) ‘Water Quality Prediction using Artificial Intelligence and
Machine learning Algorithms’, 71(4), pp. 6114–6132.
Rodr, R., Vogt, M. and Bajorath (2017) ‘Support Vector Machine Classification and Regression
Prioritize Different Structural Features for Binary Compound Activity and Potency Value
Prediction’. Available at: https://doi.org/10.1021/acsomega.7b01079.
Shams, M.Y., Elshewey, A.M. and Sayed (2023) ‘Water quality prediction using machine learning
models based on grid search method’, Multimedia Tools and Applications [Preprint],
(0123456789). Available at: https://doi.org/10.1007/s11042-023-16737-4.
Soori, M., Arezoo, B. and Dastres, R. (2023) ‘Artificial intelligence, machine learning and deep
learning in advanced robotics, a review’, Cognitive Robotics, 3(April), pp. 54–70. Available at:
https://doi.org/10.1016/j.cogr.2023.04.001.
Surucu, O., Andrew, S. and Yawney, J. (2023) ‘Condition Monitoring using Machine Learning : A
Review of Theory, Applications, and Recent Advances’, Expert Systems With Applications,
221(October 2021), p. 119738. Available at: https://doi.org/10.1016/j.eswa.2023.119738.
Tao, H., Majeed, M. and Abdulameer (2022) ‘Neurocomputing Groundwater level prediction using
machine learning models : A comprehensive review’, Neurocomputing, 489, pp. 271–308.
Available at: https://doi.org/10.1016/j.neucom.2022.03.014.
Trentinaglia, M.T., Baldi, L. and Peri, M. (2023) ‘Supporting agriculture in developing countries : new
insights on the impact of official development assistance using a climate perspective’,
Agricultural and Food Economics [Preprint]. Available at:
https://doi.org/10.1186/s40100-023-00282-7.
Union, I.T. (2021) United Nations Activities on Artificial Intelligence (AI).
Zahoor, I. and Mushtaq, A. (2023) ‘Water Pollution from Agricultural Activities : A Critical Global
Review’, 23(1), pp. 164–176.
Zhu, M., Wang, J. and Yang (2022) ‘Eco-Environment & Health A review of the application of
machine learning in water quality evaluation’, Eco-Environment & Health, 1(2), pp. 107–116.
Available at: https://doi.org/10.1016/j.eehl.2022.06.001.
