AI-Based Machine Learning Algorithms for Water Quality Analysis: A Review
All content following this page was uploaded by Nilusha Perera on 27 April 2024.
Keywords: Artificial Intelligence; Groundwater; Machine Learning; Surface Water; Water Quality
1. Introduction
Water is essential to all living beings, wherever and however they live, and it plays a
significant role in maintaining human health and welfare. Drinking water therefore needs to be
purified before it is consumed or used for other purposes. Water quality and suitability for use
are determined by taste, odour, colour, and the concentration of organic and inorganic matter
(Dharmasena and Project, 2021). Some people rely on municipal water supplies, while others have
private wells that are not monitored by government or municipal agencies, so the well owner must
take the initiative to monitor water quality. Over the last few decades, most well water sources
have been polluted by sources such as agricultural activities, human settlement, deforestation,
and industrial waste (Alharthi, 2015). Conditions are frequently worse in developing nations,
whose economies depend heavily on agriculture, which is often the main source of income for
millions of people (Trentinaglia, Baldi and Peri, 2023). Owing to a lack of resources and
technology, many farmers in these nations are unable to adopt sustainable and ecologically
friendly agricultural practices. The result is widespread pollution of water sources, local
ecological deterioration, and health problems for those who depend on them (Zahoor and Mushtaq,
2023).
This review examines the landscape of AI-based ML algorithms in water quality analysis and their
applications across groundwater, surface water, and drinking water assessment. Drinking water
pollution in heavily agricultural areas can be identified using pH, electrical conductivity (EC),
total dissolved solids (TDS), nitrate (NO3-), potassium (K+), iron (Fe), manganese (Mn2+),
magnesium (Mg2+), lead (Pb2+), cadmium (Cd2+), zinc (Zn2+), copper (Cu2+), and arsenic (As), and
a few ML techniques show significant prediction results in water quality analysis. Logistic
regression (LR), support vector machine (SVM) and random forest (RF) models are evaluated on
metrics such as accuracy, precision and recall. This paper also discusses the challenges of
implementing these algorithms and highlights future applications that hold promise for advancing
our understanding of water quality. As the world faces evolving water-related challenges, from
contamination threats to the sustainable management of water resources, this review seeks to
provide a comprehensive overview of how AI-based ML algorithms are reshaping water quality
analysis. By examining their current use, addressing challenges, and considering future
possibilities, we aim to underscore the significance of this technology in safeguarding our most
precious resource: water.
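To make the evaluation metrics named above concrete, the following minimal sketch computes accuracy, precision and recall for a binary classifier. The labels are hypothetical (1 = sample flagged as polluted, 0 = acceptable) and the function name is illustrative; none of it comes from the studies reviewed here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty inputs and empty denominators.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical ground truth and model output for eight water samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers how many flagged samples were truly polluted, while recall answers how many polluted samples were caught. In a drinking water setting, a missed contaminated sample is usually the costlier error, so recall often deserves particular attention.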
Support vector regression (SVR) and regression trees (RT) can be used to forecast wastewater
quality (Granata, Papirio and Esposito, 2017). ML algorithms also play a pivotal role in
safeguarding the quality of groundwater, an indispensable source of freshwater. Convolutional
neural networks (CNNs) and regression models excel in predicting contamination levels by
analyzing historical groundwater quality data alongside geological characteristics, land usage,
and pollution incidents (Busico, Kazakis and Cuoco, 2020). Reinforcement learning can be used to
optimize remediation strategies, recommending cost-effective and environmentally friendly
solutions (Bionics, 2023). Real-time monitoring systems powered by ML detect anomalies in
groundwater well sensor data quickly, enabling prompt action. Geographic information system (GIS)
integration aids spatial analysis by identifying pollution hotspots for targeted intervention
(Union, 2021). In addition, data fusion techniques combine various datasets to reveal complex
relationships, which illustrates ML's adaptability across contamination prediction, remediation
optimization, and real-time monitoring (Huang and Fohrer, 2023). Common techniques such as RF,
decision trees (DT), neural networks, K-Nearest Neighbors, Naive Bayes, and SVM further enhance
groundwater quality analysis by providing robust predictions and interpretability (Tao, Majeed
and Abdulameer, 2022).
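The real-time anomaly detection described above can be illustrated with a deliberately simple statistical baseline: a rolling z-score over recent sensor readings. This is only a sketch. The source does not say which detector the monitoring systems use, and the window size, threshold, and electrical conductivity values below are hypothetical.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the preceding
    `window` readings by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as an anomaly.
            if readings[i] != history[0]:
                anomalies.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical EC readings from one well sensor, with one spike.
ec_readings = [410, 412, 409, 411, 413, 410, 655, 412, 411]
flagged = rolling_zscore_anomalies(ec_readings)
print(flagged)
```

A learned detector would replace the z-score rule, but the surrounding loop stays the same: compare each new reading against recent history and raise a flag when it falls outside the expected range.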
Various ML models are commonly employed in surface water quality analysis. Random Forest, known
for its robustness and suitability for large datasets, predicts pollution levels by considering
environmental factors. Gradient boosting techniques such as XGBoost and LightGBM offer high
predictive accuracy for water quality parameters. Long Short-Term Memory (LSTM) networks excel
at modelling time series data, capturing water quality fluctuations over time. CNNs extend their
image recognition capabilities to the analysis of remote sensing or underwater images, aiding
pattern and anomaly detection. Lastly, SVM is versatile, serving both classification and
regression tasks: it can classify water samples and predict pollutant concentrations from input
features (Rodr, Vogt and Bajorath, 2017). Together, these models help researchers and
environmental experts better understand and manage surface water quality (Zhu, Wang and Yang,
2022).
These ML models can be adapted to specific surface water quality prediction or monitoring
tasks based on the available data and the objectives of the analysis. Researchers and practitioners
often choose the most suitable model or combination of models based on the complexity of the
problem and the performance requirements of the application.
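One common way to choose between candidate models is to compare their cross-validated accuracy on the same data. The sketch below does this for two stand-in classifiers, a majority-class baseline and a 1-nearest-neighbour model. The features (two pre-scaled measurements per sample), labels, and function names are all hypothetical, and the candidates are illustrative rather than a recommendation.

```python
def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k nearest training samples (Euclidean)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = [train_y[i] for i in order[:k]]
    return max(sorted(set(votes)), key=votes.count)


def majority_predict(train_X, train_y, x):
    """Baseline: always predict the most frequent training label."""
    return max(sorted(set(train_y)), key=train_y.count)


def cross_val_accuracy(predict, X, y, folds=4):
    """Accuracy over `folds` interleaved folds (sample i goes to fold i % folds)."""
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(1 for i in test if predict(train_X, train_y, X[i]) == y[i])
    return correct / len(X)


# Hypothetical scaled features; label 1 = polluted, 0 = acceptable.
X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.30, 0.20),
     (0.80, 0.90), (0.90, 0.80), (0.85, 0.75), (0.70, 0.90)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = {"majority baseline": majority_predict, "1-NN": knn_predict}
scores = {name: cross_val_accuracy(fn, X, y) for name, fn in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Because every candidate is scored on the same folds, the comparison is like for like. In practice, the metric would be chosen to match the application's performance requirements.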
Data Fusion for Water Quality | Integration of heterogeneous data sources; enhancing data quality | Effective data fusion algorithms; data accuracy and consistency; addressing data silos
Environmental Remediation | Effective remediation strategies; cost-efficient solutions; impact minimization | Models for optimal site selection; long-term remediation effectiveness assessment; holistic environmental impact assessment
3. Conclusion
The integration of ML algorithms into water quality analysis has substantially changed
environmental science and resource management. This review highlights the progress ML has made
in enhancing our ability to monitor, predict, and safeguard water quality across groundwater,
surface water, and drinking water assessments. From anticipating contamination events in
groundwater to optimizing treatment processes for safe drinking water, these algorithms have
helped researchers, regulators, and stakeholders make informed decisions and respond effectively
to emerging challenges. Challenges remain, however: data quality, the interpretability of
intricate models, and the scalability of algorithms for large-scale monitoring networks demand
ongoing attention and inventive solutions. In light of these challenges, we emphasize the need
for continued research and innovation in ML-based water quality analysis. Collaboration among
scientists, engineers, policymakers, and industry experts is essential to bridge existing
research gaps, refine current algorithms, and pioneer novel approaches that overcome the
limitations of current methodologies. The future of water quality analysis lies at the
intersection of cutting-edge technology and environmental stewardship, and we remain committed
to harnessing ML to safeguard our most valuable resource: water. By fostering interdisciplinary
collaboration and investing in research and development, the field can chart a course towards a
sustainable, water-secure future. In summary, the fusion of ML algorithms with water quality
analysis reflects continuing innovation and reiterates our duty to preserve and protect the
life-sustaining waters that nourish us all.
References
Alharthi, F.A. (2015) ‘Analysis of Physiochemical Parameters to Evaluate the Drinking Water
Quality in the State of Perak, Malaysia’, 2015(Cd).
Bionics, A. (2023) ‘Retraction Retracted: Water Quality Prediction Using Artificial Intelligence
Algorithms’, 2020.
Busico, G., Kazakis, N. and Cuoco (2020) ‘A novel hybrid method of specific vulnerability to
anthropogenic pollution using multivariate statistical and regression analyses’, Water Research,
p. 115386. Available at: https://doi.org/10.1016/j.watres.2019.115386.
Chen, K. et al. (2020) ‘Comparative analysis of surface water quality prediction performance and
identification of key water parameters using different machine learning models based on big
data’, Water Research, 171, p. 115454. Available at:
https://doi.org/10.1016/j.watres.2019.115454.
Cui, L., Yang, S. and Zuo, M. (2018) ‘A survey on the application of machine learning for Internet of
Things’, International Journal of Machine Learning and Cybernetics, 0(0), p. 0.
Available at: https://doi.org/10.1007/s13042-018-0834-5.
Dharmasena, P.B. and Project, W.P. (2021) ‘Current Status of Land Degradation in Badulla
District Current Status of Land Degradation in Badulla District Rehabilitation of Degraded
Agricultural Lands in Kandy, Badulla and P . B . Dharmasena National Consultant – Land
management and SLM Information Management July 2014’, (November).
Garba, M.A. (2018) ‘PHYSICOCHEMICAL ANALYSIS OF GROUNDWATER SAMPLES OF
GWOZA TOWN AND ENVIRONS, NORTHEASTERN NIGERIA’, (December).
Granata, F., Papirio, S. and Esposito (2017) ‘Machine Learning Algorithms for the Forecasting of
Wastewater Quality Indicators’, pp. 1–12. Available at: https://doi.org/10.3390/w9020105.
Huang, J. and Fohrer, N. (2023) ‘A grid-based interpretable machine learning method to understand the
spatial relationships between watershed properties and water quality’, 154(July). Available at:
https://doi.org/10.1016/j.ecolind.2023.110627.
Manuel, C., Pedro, M. and Pires, M. (2021) ‘A Comparison of AutoML Tools for Machine
Learning, Deep Learning and XGBoost’, (October). Available at:
https://doi.org/10.1109/IJCNN52387.2021.9534091.
Miller, T. and Durlik, I. (2023) ‘Predictive Modeling of Urban Lake Water Quality Using
Machine Learning: A 20-Year Study’, Applied Sciences.
Pandey, J. and Verma, S. (2022) ‘Water Quality Prediction using Artificial Intelligence and
Machine learning Algorithms’, 71(4), pp. 6114–6132.
Rodr, R., Vogt, M. and Bajorath (2017) ‘Support Vector Machine Classification and Regression
Prioritize Different Structural Features for Binary Compound Activity and Potency Value
Prediction’. Available at: https://doi.org/10.1021/acsomega.7b01079.
Shams, M.Y., Elshewey, A.M. and Sayed (2023) ‘Water quality prediction using machine learning
models based on grid search method’, Multimedia Tools and Applications [Preprint],
(0123456789). Available at: https://doi.org/10.1007/s11042-023-16737-4.
Soori, M., Arezoo, B. and Dastres, R. (2023) ‘Artificial intelligence, machine learning and deep
learning in advanced robotics, a review’, Cognitive Robotics, 3(April), pp. 54–70. Available at:
https://doi.org/10.1016/j.cogr.2023.04.001.
Surucu, O., Andrew, S. and Yawney, J. (2023) ‘Condition Monitoring using Machine Learning: A
Review of Theory, Applications, and Recent Advances’, Expert Systems With Applications,
221(October 2021), p. 119738. Available at: https://doi.org/10.1016/j.eswa.2023.119738.
Tao, H., Majeed, M. and Abdulameer (2022) ‘Groundwater level prediction using
machine learning models: A comprehensive review’, Neurocomputing, 489, pp. 271–308.
Available at: https://doi.org/10.1016/j.neucom.2022.03.014.
Trentinaglia, M.T., Baldi, L. and Peri, M. (2023) ‘Supporting agriculture in developing countries: new
insights on the impact of official development assistance using a climate perspective’,
Agricultural and Food Economics [Preprint]. Available at:
https://doi.org/10.1186/s40100-023-00282-7.
Union, I.T. (2021) United Nations Activities on Artificial Intelligence (AI).
Zahoor, I. and Mushtaq, A. (2023) ‘Water Pollution from Agricultural Activities: A Critical Global
Review’, 23(1), pp. 164–176.
Zhu, M., Wang, J. and Yang (2022) ‘A review of the application of
machine learning in water quality evaluation’, Eco-Environment & Health, 1(2), pp. 107–116.
Available at: https://doi.org/10.1016/j.eehl.2022.06.001.