Research Article
Water Quality Prediction Using Artificial Intelligence Algorithms
Received 29 November 2020; Revised 12 December 2020; Accepted 16 December 2020; Published 30 December 2020
Copyright © 2020 Theyazn H. H Aldhyani et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
In recent years, water quality has been threatened by various pollutants. Therefore, modeling and predicting water quality
have become very important in controlling water pollution. In this work, advanced artificial intelligence (AI) algorithms are
developed to predict water quality index (WQI) and water quality classification (WQC). For the WQI prediction, artificial
neural network models, namely nonlinear autoregressive neural network (NARNET) and long short-term memory (LSTM) deep
learning algorithm, have been developed. In addition, three machine learning algorithms, namely, support vector machine
(SVM), K-nearest neighbor (K-NN), and Naive Bayes, have been used for the WQC forecasting. The dataset has 7
significant parameters, and the developed models were evaluated using several statistical measures. The results revealed that
the proposed models can accurately predict WQI and classify the water quality with superior robustness. Prediction
results demonstrated that the NARNET model performed slightly better than the LSTM for the prediction of the WQI values
and the SVM algorithm has achieved the highest accuracy (97.01%) for the WQC prediction. Furthermore, the NARNET and
LSTM models have achieved similar accuracy for the testing phase with a slight difference in the regression coefficient
(R_NARNET = 96.17% and R_LSTM = 94.21%). This kind of promising research can contribute significantly to water management.
1. Introduction

Water is the most significant resource of life, crucial for supporting the life of most existing creatures and human beings. Living organisms need water of sufficient quality to continue their lives. There are certain limits of pollution that water species can tolerate. Exceeding these limits affects the existence of these creatures and threatens their lives.

Most ambient water bodies such as rivers, lakes, and streams have specific quality standards that indicate their quality. Moreover, water specifications for other applications/usages have their own standards. For example, irrigation water must neither be too saline nor contain toxic materials that can be transferred to plants or soil, thus destroying the ecosystems. Water quality for industrial uses also requires different properties based on the specific industrial processes. Some of the low-priced resources of fresh water, such as ground and surface water, are natural water resources. However, such resources can be polluted by human/industrial activities and other natural processes.

Hence, rapid industrial development has prompted the decay of water quality at a disturbing rate. Furthermore, infrastructures, with the absence of public awareness, and less hygienic qualities, significantly affect the quality of drinking water [1]. In fact, the consequences of polluted drinking water are very dangerous and can badly affect health, the environment, and infrastructures. As per the United Nations (UN) report, about 1.5 million people die each year because of diseases driven by contaminated water. In developing countries, it is reported that 80% of health problems are caused by contaminated water. Five million deaths and 2.5 billion illnesses are reported annually [2]. Such a mortality rate is higher than deaths resulting from accidents, crimes, and terrorist attacks [3].
2 Applied Bionics and Biomechanics
Therefore, it is very important to suggest new approaches to analyze and, if possible, to predict the water quality (WQ). It is recommended to consider the temporal dimension for forecasting the WQ patterns to ensure the monitoring of the seasonal change of the WQ [4]. However, combining several models to predict the WQ gives better results than using a single model [5–7]. There are several methodologies proposed for the prediction and modeling of the WQ. These methodologies include statistical approaches, visual modeling, analyzing algorithms, and predictive algorithms. To determine the correlation and relationship among different water quality parameters, multivariate statistical techniques have been employed [4]. The geostatistical approaches were used for transitional probability, multivariate interpolation, and regression analysis [5].

Massive increases in population, the industrial revolution, and the use of fertilizers and pesticides have led to serious effects on the WQ environments [8, 9]. Thus, having models for the prediction of the WQ is of great help for monitoring water contamination.

Currently, two main types of models for predicting water quality are available: mechanism- and non-mechanism-oriented models. The mechanism model is relatively sophisticated; it uses the advanced system structure data for simulating the WQ, and thus, it is considered as a multifunctional model that can be used for any water body. In addition, the Streeter–Phelps (S–P) model, one of the earliest WQ simulation models, has been used widely.

Later, some countries developed a variety of WQ models including the QUAL model [10] and the WASP model [11], which have gained wide usage in mimicking the water quality of rivers. This was followed by Warren and Bach [12], who suggested using MIKE21 for designing systems to model the estuaries, coastal waters, and seas.

Hayes et al. [13] have paired two models for improving the quality of downstream water, namely, quasi-static two-dimensional dissolved oxygen reservoir model (DORM-II) and a daily scale optimal dispatch model.

Using environmental fluid dynamics code (EFDC), a two-dimensional numerical model was developed to simulate the water environment of the Mudan River [14]. This is based on the distance between points and intervals [15].

Another study was conducted by Batur and Maktav [16] to predict the WQ of Lake Gala (Turkey) using satellite image fusion based on the principal component analysis (PCA) method. Jaloree et al. [17] have attempted to predict the WQ of the Narmada River with five WQ indicators using a decision tree model. Another study suggested the use of the deep Bidirectional Stacked Simple Recurrent Unit (Bi-S-SRU) [18] for designing a precise forecasting scheme of the WQ in smart mariculture.

Liao and Sun [19] developed a model to forecast the WQ of China’s Chao Lake by pairing the ANN and decision tree algorithm. Yan and Qian [20] proposed an affinity propagation clustering model based on a least-squares support vector machine (AP-LSSVM). This model is highly sensitive to vacancies. Solanki et al. [21] analyzed and predicted the chemical eigenvalues of water, especially dissolved oxygen and pH, using the deep learning network model, which was reported to demonstrate more accurate results compared with supervised learning-based techniques. Li et al. [22] developed a novel hybrid model using a neural network and the Markov chain method. This model has helped in predicting dissolved oxygen, a primary measure of the WQ [23]. Khan and See [24] included dissolved oxygen, chlorophyll, conductivity, and turbidity in the developed WQ model using an artificial neural network (ANN). Yan et al. [25] suggested a genetic algorithm (GA) and particle swarm optimization (PSO) algorithm to enhance the backpropagation (BP) neural network to predict the oxygen demanded in a lake. An enhanced accuracy of the prediction results was reported.

Several studies have been performed to model and predict the water quality using different ANN models. These studies have confirmed the feasibility and effectiveness of employing ANN applications to predict the quality of drinking water.

Currently, researchers mostly emphasize enhancing the applicability and reliability of water quality prediction/modelling by using a variety of new technologies such as fuzzy logic, stochastic, ANN, and deep learning [26, 27].

Shafi et al. [28] proposed four machine learning algorithms, namely, Support Vector Machines (SVM), Neural Networks (NN), Deep Neural Networks, and k-Nearest Neighbors (kNN), for the prediction of water quality. Using single feed-forward neural networks to classify water quality, 25 parameters have been included as input parameters [29].

Ranković et al. [30] estimated the dissolved oxygen (DO) by employing the ANN model. Gazzaz et al. [31] estimated the WQI by using an ANN model, and the Internet of Things (IoT) technology was applied to collect the dataset from water resources. Abyaneh [32] has applied machine learning approaches like ANN and regression to predict the chemical oxygen demand (COD). Sakizadeh [33] used ANN with Bayesian regularization to estimate the water quality index (WQI). In addition, the radial-basis-function (RBF), a type of the ANN model, was used for the prediction and classification of water quality [34, 35].

In addition, it has been reported that deep learning methods showed high performance in predicting the WQ when compared to the traditional methods. Marir et al. [36] developed a model to detect uncommon behavior in large-scale network traffic data. While a deep learning algorithm was employed for extracting features, a multilayer ensemble support vector machine model was used for classification. Fadlullah et al. [37] visualized a reward-based deep learning structure combining a deep convolutional neural network and a deep belief network.

For the analysis and prediction of the WQ of groundwater, different algorithms including ANN, Bayesian neural networks, adaptive neurofuzzy [38], decision support system (DSS), and autoregressive moving average (ARMA) have been applied [39]. However, these mimicking models have some limitations.

The contributions of the current study can be summarized as follows:

(i) Developing highly efficient advanced artificial intelligence models to predict the water quality index
(WQI) based on artificial neural networks and deep learning algorithms

(ii) Applying some machine learning models, namely, support vector machine (SVM), K-nearest neighbour (K-NN), and Naive Bayes algorithms, for the prediction of water quality classification (WQC).

The highly efficient developed models can be generalized and used to forecast the water pollution process, which will help the decision-makers to make the right decisions at the right time.

2. Materials and Methods

Figure 1 displays the proposed methodology of the present study.

[Figure 1: proposed methodology. Recovered labels: Dataset; Data exploration; Deep learning and artificial neural network models; Prediction (WQI), with evaluation metrics MSE, RMSE, and R; Classification (WQC), with accuracy, sensitivity, specificity, and precision.]

2.1. Dataset. The dataset used in this study is collected from certain historical locations in India. It contained 1679 samples from different Indian states during the period from 2005 to 2014. The dataset has 7 significant parameters, namely, dissolved oxygen (DO), pH, conductivity, biological oxygen demand (BOD), nitrate, fecal coliform, and total coliform. Data was collected by the Indian government to ensure the quality of the supplied drinking water. This dataset was obtained from Kaggle https://www.kaggle.com/anbarivan/indian-water-quality-data.

2.2. Data Preprocessing. The processing phase is very important in data analysis to improve the data quality. In this phase, the WQI has been calculated from the most significant parameters of the dataset. Then, water samples have been classified on the basis of the WQI values. For obtaining superior accuracy, the z-score method has been used as a data normalization technique.

2.2.1. Water Quality Index Calculation. To measure water quality, the WQI is calculated using various parameters that significantly affect WQ [40–42]. In this study, a published dataset is considered to test the proposed model, and seven significant water quality parameters are included. The WQI has been calculated using the following formula:

\[ \mathrm{WQI} = \frac{\sum_{i=1}^{N} q_i \times w_i}{\sum_{i=1}^{N} w_i}, \tag{1} \]

where N is the total number of parameters included in the WQI calculations, q_i is the quality rating scale for each parameter i calculated by equation (2) below, and w_i is the unit weight for each parameter calculated by equation (3).

\[ q_i = 100 \times \frac{V_i - V_{\mathrm{Ideal}}}{S_i - V_{\mathrm{Ideal}}}, \tag{2} \]

where V_i is the measured value of parameter i in the tested water samples, V_Ideal is the ideal value of parameter i in pure water (0 for all parameters except DO = 14.6 mg/l and pH = 7.0), and S_i is the recommended standard value of parameter i (as shown in Table 1).

\[ w_i = \frac{K}{S_i}, \tag{3} \]

where K is the proportionality constant that can be calculated as follows:

\[ K = \frac{1}{\sum_{i=1}^{N} S_i}, \tag{4} \]

Tables 2 and 3 represent the unit weight of each parameter and the WQC, respectively.

2.2.2. Z-Score Normalization Method. Normalization is a way to simplify calculations. It transforms a dimensional expression into a nondimensional scalar expression. Z-score normalization (or normalization score) is a normalization method used to normalize parameters by using the mean (μ) and standard deviation (σ) values of the tested data. It can be calculated as follows:

\[ Z\text{-score} = \frac{x - \mu}{\sigma}, \tag{5} \]
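As a rough illustration of equations (1)–(5), the following Python sketch computes the WQI of one sample and z-score-normalizes a list of values. It is a minimal sketch under stated assumptions, not the authors' implementation (the paper reports MATLAB 2018): the function names, the three-parameter `standards` dictionary, and the sample values are hypothetical and are not taken from Table 1, and the population standard deviation is assumed for σ because the text does not say which estimator was used.

```python
from statistics import mean, pstdev

# Ideal values follow the text: 0 for every parameter except DO (14.6 mg/l) and pH (7.0).
IDEAL = {"DO": 14.6, "pH": 7.0}


def unit_weights(standards):
    """Equations (3)-(4): w_i = K / S_i with K = 1 / sum(S_i)."""
    k = 1.0 / sum(standards.values())
    return {name: k / s for name, s in standards.items()}


def quality_rating(value, standard, ideal=0.0):
    """Equation (2): q_i = 100 * (V_i - V_ideal) / (S_i - V_ideal)."""
    return 100.0 * (value - ideal) / (standard - ideal)


def wqi(sample, standards):
    """Equation (1): weighted mean of the quality ratings q_i with weights w_i."""
    w = unit_weights(standards)
    q = {
        name: quality_rating(sample[name], standards[name], IDEAL.get(name, 0.0))
        for name in standards
    }
    return sum(q[name] * w[name] for name in standards) / sum(w.values())


def z_scores(values):
    """Equation (5): (x - mu) / sigma; assumes the values are not all identical."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


# Hypothetical standards and sample, for illustration only (not from Table 1).
standards = {"DO": 10.0, "pH": 8.5, "BOD": 5.0}
sample = {"DO": 10.0, "pH": 8.5, "BOD": 5.0}
index = wqi(sample, standards)  # every parameter sits exactly at its standard
```

Because equation (1) divides by the sum of the weights, the constant K cancels out of the final index; it only rescales the individual weights.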
In equation (5), x is the measured value of parameter i in the tested sample.

2.3. Prediction of Water Quality Index. For this purpose, ANN models, namely, nonlinear autoregressive neural network (NARNET) and long short-term memory (LSTM) deep learning algorithm, were used for the prediction of water quality index.

2.3.1. Artificial Neural Network (ANN) Model. In general, the neural network (NN) models are used as very powerful machine learning algorithms for time-series prediction in different engineering applications. An ANN model consists of an input layer, one or more hidden layers, and an output layer. Each hidden layer has weight and bias parameters to manage neurons. To transfer the data from the hidden layer into the output layer, the activation function is used. The learning algorithms are used to select the weights within the NN framework. Weights are selected to minimize a performance measure such as the mean square error (MSE).

The NARNET model is a very popular multilayer feedforward network. It starts with a guessed initial weight value, which is then updated using the actual data. Consequently, there is some sort of randomness in the prediction process

\[ y(t) = h\big(y(t-1), y(t-2), \cdots, y(t-p)\big) + \epsilon(t), \tag{6} \]

where y(t) is the value of the time series at time t, which is predicted from the p previous observation values of the series. The function h is obtained by optimizing the network weights and neuron biases. Finally, ϵ(t) is the error obtained from the model at time t.

In this work, the NARNET model has been developed to predict the WQI. Unlike other ANN models such as the feedforward neural network model, the NARNET model is a time series model that is used to predict stationary time series. The WQI parameters take the form of time series; therefore, the NARNET model is proposed to predict the WQI. Table 4 shows the significant parameters of the developed model. Figure 3 represents the topology of the developed NARNET model.

2.3.2. Deep Neural Network (DNN) Model. The DNN model is one type of feedforward NN algorithm, which is a fundamental technique for deep learning. DNN consists of 3 levels of nodes, and each node follows a nonlinear function, except for the input node. DNN uses backpropagation as a supervised learning technique. In this work, a WQI model was developed using the DNN algorithm, and the simple DNN was compared with the proposed model. This model includes
[Figure: NARNET network topology. Recovered labels: input y(t), delays 1:8, hidden layer (12) with weights W and bias b, output layer (1) with weights W and bias b, output y(t).]

[Figure: LSTM cell diagram. Recovered labels: x(t), h(t–1), c(t–1), input gate i(t) with weights W(i) and U(i) ("Input: Does x(t) matter?"), output gate o(t) with weights W(o) and U(o), sigmoid activations σ.]

Parameters of the LSTM model (partial):

Delays                          [1 3 4 7]
Maximum number of iterations    1500
Maximum number of epochs        150

Table 6: Performances of the NARNET, LSTM, and ANN models to predict WQI.

Models    Training data set          Testing data
          MSE       R (%)            MSE       R (%)
NARNET    0.2815    95.97            0.1353    96.17
LSTM      0.1316    93.93            0.1028    94.21

[Figure 6: Histogram error of the NARNET model. Error histogram with 20 bins; x-axis: Errors = Targets - Outputs, from –36.01 to 74.34; y-axis: number of samples, 0 to 500; legend: Training, Validation, Test, Zero error.]
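The autoregressive form in equation (6) can be made concrete with a small sketch of how lagged inputs might be assembled from a series. This is only an illustration under stated assumptions: `lagged_pairs`, `persistence`, and the toy series are hypothetical, and `persistence` is a deliberately trivial stand-in for the function h, not the trained NARNET network. The delays 1:8 follow the NARNET setting reported in the paper.

```python
def lagged_pairs(series, delays):
    """Build (inputs, target) pairs for y(t) = h(y(t-d1), ..., y(t-dk)) + e(t)."""
    delays = list(delays)
    max_delay = max(delays)
    pairs = []
    for t in range(max_delay, len(series)):
        inputs = [series[t - d] for d in delays]  # lagged observations
        pairs.append((inputs, series[t]))         # value to be predicted
    return pairs


def persistence(inputs):
    """Trivial stand-in for h: repeat the most recent lag, y(t-1)."""
    return inputs[0]


def mse(pairs, h):
    """Mean square error of predictor h over the (inputs, target) pairs."""
    return sum((target - h(inputs)) ** 2 for inputs, target in pairs) / len(pairs)


series = [float(v) for v in range(12)]       # toy series 0, 1, ..., 11
pairs = lagged_pairs(series, range(1, 9))    # delays 1:8
baseline_error = mse(pairs, persistence)
```

In practice a regression network would be fitted on the `inputs` to predict each `target`; the persistence baseline is only there to make the sketch runnable end to end.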
The best hyperplane is the line with the largest margin, where the margin is the distance between the hyperplane and the nearest input objects. The input points that define the hyperplane are called support vectors. In this work, the linear SVM model along with the Gaussian radial basis function (equation (17)) is used to classify the tested water samples based on their quality.

\[ K(X, X') = \exp\left(-\frac{\lVert X - X' \rVert^{2}}{2\sigma^{2}}\right), \tag{17} \]

where X and X′ represent the feature vectors of the input dataset and ‖X − X′‖² is the squared Euclidean distance between the two feature inputs. The σ is a free parameter.

2.4.2. K-Nearest Neighbor (K-NN) Model. The K-NN algorithm is a basic classification and regression method. It is used to find the K values that are close to values in the training dataset. Most of these values belong to a certain class, and thus, tested data can be classified. The K value is used to find the closest points in the feature vectors, and the value should be unique. The following expression of the Euclidean distance function (D_i) can be used.

\[ D_i = \sqrt{(x_1 - x_2)^{2} + (y_1 - y_2)^{2}}, \tag{18} \]

where x_1, x_2, y_1, and y_2 are the variables for input data.
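Equations (17) and (18) can be sketched in a few lines of Python. This is a minimal sketch, not the classifiers used in the study: the helper names, the value k = 3, the class labels "good" and "poor", and the toy training points are all hypothetical, and the distance is generalised from two coordinates to feature vectors of any length.

```python
import math
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rbf_kernel(a, b, sigma=1.0):
    """Equation (17): exp(-||a - b||^2 / (2 * sigma^2))."""
    return math.exp(-squared_distance(a, b) / (2.0 * sigma ** 2))


def euclidean(a, b):
    """Equation (18), generalised to any number of features."""
    return math.sqrt(squared_distance(a, b))


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalised two-feature samples with illustrative labels.
train = [
    ((0.0, 0.0), "good"), ((0.1, 0.2), "good"), ((0.2, 0.1), "good"),
    ((3.0, 3.0), "poor"), ((3.1, 2.9), "poor"), ((2.9, 3.2), "poor"),
]
label = knn_predict(train, (0.1, 0.1), k=3)
```

The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, with σ controlling how quickly; an odd k avoids ties in a two-class vote.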
[Figure 7: Histogram error and mean error of the LSTM model in the training and testing phases. Panels: train data (error StD = 0.36213) and test data (error StD = 0.31957); each shows the errors per sample and the error histogram.]
2.4.3. Naive Bayes Model. The Bayesian method uses the knowledge of probability statistics to predict and classify datasets. The Bayesian algorithm combines prior and posterior probabilities to avoid the supervisor’s bias and the overfitting phenomenon of using sample information alone.

Naive Bayes is a type of classification algorithm based on Bayes’ theorem and the assumption of the independence of characteristic conditions. Attributes are assumed to be conditionally independent of each other when the target value is given. This assumption greatly reduces the complexity of the Bayesian method.

In Bayesian analysis, the probability of an event A given an event B is not the same as the probability of B given A, as in equation (19).

\[ P(A \mid B) \neq P(B \mid A). \tag{19} \]

\[ P(C \mid A) = \frac{P(C) \times P(A \mid C)}{P(A)}, \tag{20} \]

and F-score evaluation metrics were employed to evaluate the developed classification model to predict the WQC. The statistical parameters used were defined as follows:

(a) Mean Square Error (MSE)

\[ \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}, \tag{21} \]

where y_i and ŷ_i are the predicted and the observed responses, respectively, and N is the total number of variables.

(b) Accuracy

(c) Specificity
[Figure: regression plots of output versus target (legend: Data, Fit, Y = T). Recovered panel annotations: Output ≈ 0.92 × Target + 5.3; All: R = 0.95926, Output ≈ 0.93 × Target + 5.]
(d) Sensitivity

\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN} \times 100\%, \tag{24} \]

(e) Precision

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \times 100\%, \tag{25} \]

where TP, TN, FP, and FN are the true positive, true negative, false positive, and false negative, respectively.

2.6. Correlation Analysis. Pearson’s correlation coefficient approach is applied to analyze the correlation between the significant parameters of the dataset used for the prediction of the WQI values.

\[ R = \frac{n \sum (x \times y) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[ n \sum x^{2} - \left(\sum x\right)^{2} \right] \times \left[ n \sum y^{2} - \left(\sum y\right)^{2} \right]}} \times 100\%, \tag{27} \]
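The evaluation measures can be sketched in Python as follows. Sensitivity, precision, and Pearson's R follow equations (24), (25), and (27); the accuracy, specificity, and F-score expressions are the standard textbook definitions and are an assumption here, since the corresponding equations are not available in this text. The function names and the confusion-matrix counts are hypothetical.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Percent-valued metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # equation (24)
    precision = tp / (tp + fp)     # equation (25)
    return {
        # Standard definitions, assumed because the source equations are missing.
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "f_score": 100.0 * 2 * precision * sensitivity / (precision + sensitivity),
        "sensitivity": 100.0 * sensitivity,
        "precision": 100.0 * precision,
    }


def pearson_r(x, y):
    """Equation (27): Pearson's correlation coefficient, in percent."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return 100.0 * numerator / denominator


# Hypothetical counts and series, for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
r_percent = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A perfectly linear increasing relationship gives R = 100%, and a perfectly linear decreasing one gives R = -100%.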
[Figure: regression plots of output versus target (legend: Data, Fit, Y = T). Recovered panel annotation: All Data: R = 0.94004, Output ≈ 0.95 × Target + 0.027.]
2.7. Experimental Setup. The prediction experiments have been conducted in a specific environment (MATLAB 2018). The simulation has been performed using a system with i5 Processor and 4 GB RAM to process all required tasks.

3. Results and Discussion

For validating the developed model, the dataset has been divided into 70% training and 30% testing subsets. While the ANN and LSTM models were used to predict the WQI, the SVM, KNN, and Naive Bayes were utilized for the water quality classification prediction.

3.1. Prediction of the WQI. A NARNET model, with 12 hidden layers, showed a good performance in predicting the WQI values. As presented earlier, it has the following characteristics: delays of 1:8 and 12 epochs. The developed LSTM model, in contrast, has a total of 200 hidden layers, a maximum of 150 epochs, and delays of [1, 3, 4, 7].

Table 6 summarizes the performance of the developed models in predicting the WQI. The prediction accuracy of the LSTM for the testing data was slightly better than that for the training data. In addition, the LSTM model, in general, has shown a slightly better performance compared with the NARNET model according to the MSE values. However, based on the R value, the NARNET model has shown a better performance. In general, both models demonstrated an excellent prediction of the WQI values with R ≥ 93.93%.

Figure 6 illustrates the histogram error of the NARNET model. The histogram metric is used to find errors between the target values and the predicted values of training and testing datasets. The total error range is divided into 20 smaller bins, where the y-axis refers to the number of samples located in a particular bin. Figure 7 displays the histogram metric and mean errors of the LSTM model in the training and testing phases. The mean error and histogram metric are used to find the deviation between the observation values and the predicted values of training and testing.

Figures 8 and 9 display the regression plots for the predicted values of training, testing, and whole datasets for the NARNET and LSTM models, respectively. This plot is used to find the relationship between the predicted values and actual values. The “target” values in the plot are the actual dataset, whereas the “output” is the predicted values obtained from the NARNET and LSTM models. As shown in both figures, there is a clear good agreement (R > 95.7% (NARNET) and R > 93.3% (LSTM)) between the predicted WQI values and the ones calculated from the measured parameters. This implies the highly efficient performance of both developed models.

Table 7 summarizes the Pearson’s correlation coefficients of the parameters used to predict the WQI values. The correlation between the WQI parameters for selecting the optimal parameters has been obtained. Results revealed that all parameters have a strong relationship with WQI parameters. This indicates that these parameters are very important for predicting the quality of water.

3.2. Prediction of the Water Quality Classification. This section presents the results of the classification algorithms used to predict the WQC. Table 8 shows the results of the machine learning algorithms used. The performance of the SVM algorithm is clearly superior to that of the KNN and Naive Bayes models, whereas the Naive Bayes algorithm has shown the poorest performance. Figure 10 shows the performance of the algorithms used to predict the WQC.

Models         Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   F-score (%)
SVM            97.01          99.23             97.78             94.93           98.54
KNN            83.63          84.73             94.93             87.50           85.84
Naive Bayes    75.20          77.76             91.65             78.08           81.51

[Figure 10: Performance of the machine learning algorithms used for the prediction of the WQC. Bar chart by model (KNN, SVM, Naive Bayes); x-axis in %, from 0 to 100; recovered legend entry: F-score (%).]

4. Conclusions

Modeling and prediction of water quality are very important for the protection of the environment. A model developed by using advanced artificial intelligence algorithms can be used to estimate the future water quality. In this proposed methodology, the advanced artificial intelligence algorithms, namely, NARNET and LSTM models were used to predict the WQI. Moreover, machine learning algorithms such as
SVM, KNN, and Naive Bayes were used to classify the WQI data. The proposed models were evaluated and examined by some statistical parameters. For the WQI prediction, the result has revealed that the performance of the NARNET model is slightly better than the LSTM model based on the obtained R value. However, the SVM algorithm has achieved the highest accuracy of the prediction of the WQC as compared with KNN and Naive Bayes algorithms. After examining the robustness and efficiency of the proposed model for predicting the WQI, in future work, the developed models will be implemented to predict the water quality in Saudi Arabia for different types of water.

Data Availability

The dataset used in this study is collected from certain historical locations in India. It contained 1679 samples from different Indian states during the period from 2005 to 2014. The dataset has 7 significant parameters named dissolved oxygen (DO), pH, conductivity, biological oxygen demand (BOD), nitrate, fecal coliform, and total coliform. The data was collected by the Indian government to ensure the quality of the supplied drinking water. This dataset was obtained from Kaggle https://www.kaggle.com/anbarivan/indian-water-quality-data.

Conflicts of Interest

The authors declare no conflict of interest.

Authors’ Contributions

All authors contributed significantly to the completion of this article.

Acknowledgments

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research work through the project number IFT20111.

References

[1] P. Zeilhofer, L. V. A. C. Zeilhofer, E. L. Hardoim, Z. M. Lima, and C. S. Oliveira, “GIS applications for mapping and spatial modeling of urban-use water quality: a case study in District of Cuiabá, Mato Grosso, Brazil,” Cadernos de Saúde Pública, vol. 23, no. 4, pp. 875–884, 2007.
[2] M. A. Kahlown, M. A. Tahir, and H. Rasheed, National Water Quality Monitoring Programme, Fifth Monitoring Report (2005–2006), Pakistan Council of Research in Water Resources Islamabad, Islamabad, Pakistan, 2007, http://www.pcrwr.gov.pk/Publications/Water%20Quality%20Reports/Water%20Quality%20Monitoring%20Report%202005-06.pdf.
[3] UN water, “Clean water for a healthy world,” Development, 2010, https://www.undp.org/content/undp/en/home/presscenter/articles/2010/03/22/clean-water-for-a-healthy-world.html.
[4] K. Farrell-Poe, W. Payne, and R. Emanuel, Water Quality & Monitoring, University of Arizona Repository, 2000, http://hdl.handle.net/10150/146901.
[5] T. Taskaya-Temizel and M. C. Casey, “A comparative study of autoregressive neural network hybrids,” Neural Networks, vol. 18, no. 5–6, pp. 781–789, 2005.
[6] C. N. Babu and B. E. Reddy, “A moving-average filter based hybrid ARIMA-ANN model for forecasting time series data,” Applied Soft Computing, vol. 23, pp. 27–38, 2014.
[7] X. Zhang, N. Hu, Z. Cheng, and H. Zhong, “Vibration data recovery based on compressed sensing,” Acta Physica Sinica, vol. 63, no. 20, pp. 119–128, 2014.
[8] M. M. S. Cabral Pinto, C. M. Ordens, M. T. Condesso de Melo et al., “An inter-disciplinary approach to evaluate human health risks due to long-term exposure to contaminated groundwater near a chemical complex,” Exposure and Health, vol. 12, no. 2, pp. 199–214, 2020.
[9] M. M. S. Cabral Pinto, A. P. Marinho-Reis, A. Almeida et al., “Human predisposition to cognitive impairment and its relation with environmental exposure to potentially toxic elements,” Environmental Geochemistry and Health, vol. 40, no. 5, pp. 1767–1784, 2018.
[10] Y. C. Lai, C. P. Yang, C. Y. Hsieh, C. Y. Wu, and C. M. Kao, “Evaluation of non-point source pollution and river water quality using a multimedia two-model system,” Journal of Hydrology, vol. 409, no. 3-4, pp. 583–595, 2011.
[11] J. Huang, N. Liu, M. Wang, and K. Yan, “Application WASP model on validation of reservoir-drinking water source protection areas delineation,” in 2010 3rd International Conference on Biomedical Engineering and Informatics, pp. 3031–3035, Yantai, China, October 2010.
[12] I. R. Warren and H. K. Bach, “MIKE 21: a modelling system for estuaries, coastal waters and seas,” Environmental Software, vol. 7, no. 4, pp. 229–240, 1992.
[13] D. F. Hayes, J. W. Labadie, T. G. Sanders, and J. K. Brown, “Enhancing water quality in hydropower system operations,” Water Resources Research, vol. 34, no. 3, pp. 471–483, 1998.
[14] G. Tang, J. Li, Z. Zhu, Z. Li, and F. Nerry, “Two-dimensional water environment numerical simulation research based on EFDC in Mudan River, Northeast China,” in 2015 IEEE European Modelling Symposium (EMS), pp. 238–243, Madrid, Spain, October 2015.
[15] L. Hu, C. Zhang, C. Hu, and G. Jiang, “Use of grey system for assessment of drinking water quality: a case S study of Jiaozuo city, China,” in 2009 IEEE International Conference on Grey Systems and Intelligent Services (GSIS 2009), pp. 803–808, Nanjing, China, November 2009.
[16] E. Batur and D. Maktav, “Assessment of surface water quality by using satellite images fusion based on PCA method in the Lake Gala, Turkey,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 5, pp. 2983–2989, 2019.
[17] S. Jaloree, A. Rajput, and G. Sanjeev, “Decision tree approach to build a model for water quality,” Binary Journal of Data Mining & Networking, vol. 4, pp. 25–28, 2014.
[18] J. Liu, C. Yu, Z. Hu et al., “Accurate prediction scheme of water quality in smart mariculture with deep Bi-S-SRU learning network,” IEEE Access, vol. 8, pp. 24784–24798, 2020.
[19] H. Liao and W. Sun, “Forecasting and evaluating water quality of Chao Lake based on an improved decision tree method,” Procedia Environmental Sciences, vol. 2, pp. 970–979, 2010.
[20] L. Yan and M. Qian, “AP-LSSVM modeling for water quality prediction,” in Proceedings of the 31st Chinese Control Conference, pp. 6928–6932, Hefei, China, July 2012.
[21] A. Solanki, H. Agrawal, and K. Khare, “Predictive analysis of water quality parameters using deep learning,” International Journal of Computers and Applications, vol. 125, no. 9, pp. 29–34, 2015.
[22] X. Li and J. Song, “A new ANN-Markov chain methodology for water quality prediction,” in 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, Killarney, Ireland, July 2015.
[23] A. A. M. Ahmed and S. M. A. Shah, “Application of adaptive neuro-fuzzy inference system (ANFIS) to estimate the biochemical oxygen demand (BOD) of Surma River,” Journal of King Saud University - Engineering Sciences, vol. 29, no. 3, pp. 237–243, 2017.
[24] Y. Khan and C. S. See, “Predicting and analyzing water quality using Machine Learning: a comprehensive model,” in 2016 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–6, Farmingdale, NY, USA, April 2016.
[25] J. Yan, Z. Xu, Y. Yu, H. Xu, and K. Gao, “Application of a hybrid optimized BP network model to estimate water quality parameters of Beihai Lake in Beijing,” Applied Sciences, vol. 9, no. 9, p. 1863, 2019.
[26] H. R. Maier, A. Jain, G. C. Dandy, and K. P. Sudheer, “Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions,” Environmental Modelling & Software, vol. 25, no. 8, pp. 891–909, 2010.
[27] S. Lee and D. Lee, “Improved prediction of harmful algal blooms in four major South Korea’s rivers using deep learning models,” International Journal of Environmental Research and
Turkey,” Environmental Earth Sciences, vol. 56, no. 1, pp. 19–25, 2008.
[35] M. Bouamar and M. Ladjal, “A comparative study of RBF neural network and SVM classification techniques performed on real data for drinking water quality,” in 2008 5th International Multi-Conference on Systems, Signals and Devices, pp. 1–5, Amman, Jordan, July 2008.
[36] N. Marir, H. Wang, G. Feng, B. Li, and M. Jia, “Distributed abnormal behavior detection approach based on deep belief network and ensemble SVM using spark,” IEEE Access, vol. 6, pp. 59657–59671, 2018.
[37] Z. M. Fadlullah, F. Tang, B. Mao, J. Liu, and N. Kato, “On intelligent traffic control for large-scale heterogeneous networks: a value matrix-based deep learning approach,” IEEE Communications Letters, vol. 22, no. 12, pp. 2479–2482, 2018.
[38] S. Maiti and R. K. Tiwari, “A comparative study of artificial neural networks, Bayesian neural networks and adaptive neuro-fuzzy inference system in groundwater level prediction,” Environmental Earth Sciences, vol. 71, no. 7, pp. 3147–3160, 2014.
[39] C. Min, “An improved recurrent support vector regression algorithm for water quality prediction,” Journal of Computational Information, vol. 12, pp. 4455–4462, 2011.
[40] R. Das Kangabam, S. D. Bhoominathan, S. Kanagaraj, and M. Govindaraju, “Development of a water quality index (WQI) for the Loktak Lake in India,” Applied Water Science, vol. 7, no. 6, pp. 2907–2918, 2017.
[41] G. Srivastava and P. Kumar, “Water quality index with missing parameters,” International Journal of Research in Engineering and Technology, vol. 2, no. 4, pp. 609–614, 2013.
[42] S. Tyagi, B. Sharma, P. Singh, and R. Dobhal, “Water quality assessment in terms of water quality index,” American Journal
Public Health, vol. 15, no. 7, p. 1322, 2018. of Water Resources, vol. 1, no. 3, pp. 34–38, 2013.
[28] U. Shafi, R. Mumtaz, H. Anwar, A. M. Qamar, and
[43] A. A. Al-Othman, “Evaluation of the suitability of surface
H. Khurshid, “Surface water pollution detection using internet
water from Riyadh Mainstream Saudi Arabia for a variety of
of things,” in 2018 15th International Conference on Smart Cit-
uses,” Arabian Journal of Chemistry, vol. 12, no. 8, pp. 2104–
ies: Improving Quality of Life Using ICT & IoT (HONET-ICT),
2110, 2019.
pp. 92–96, Islamabad, Pakistan, October 2018.
[44] T. H. H. Aldhyani, M. Alrasheedi, A. A. Alqarni, M. Y. Alzah-
[29] Z. Ahmad, N. A. Rahim, A. Bahadori, and J. Zhang, “Improv-
rani, and A. M. Bamhdi, “Intelligent hybrid model to enhance
ing water quality index prediction in Perak River basin Malay-
time series models for predicting network traffic,” IEEE Access,
sia through a combination of multiple neural networks,”
vol. 8, pp. 130431–130451, 2020.
International Journal of River Basin Management, vol. 15,
no. 1, pp. 79–87, 2016.
[30] V. Ranković, J. Radulović, I. Radojević, A. Ostojić, and
L. Čomić, “Neural network modeling of dissolved oxygen in
the Gruža reservoir, Serbia,” Ecological Modelling, vol. 221,
no. 8, pp. 1239–1244, 2010.
[31] N. M. Gazzaz, M. K. Yusoff, A. Z. Aris, H. Juahir, and M. F.
Ramli, “Artificial neural network modeling of the water quality
index for Kinta River (Malaysia) using water quality variables
as predictors,” Marine Pollution Bulletin, vol. 64, no. 11,
pp. 2409–2420, 2012.
[32] H. Z. Abyaneh, “Evaluation of multivariate linear regression
and artificial neural networks in prediction of water quality
parameters,” Journal of Environmental Health Science and
Engineering, vol. 12, no. 1, p. 40, 2014.
[33] M. Sakizadeh, “Artificial intelligence for the prediction of
water quality index in groundwater systems,” Modeling Earth
Systems and Environment, vol. 2, no. 1, p. 8, 2016.
[34] M. I. Yesilnacar, E. Sahinkaya, M. Naz, and B. Ozkaya, “Neural
network prediction of nitrate in groundwater of Harran Plain,