ABSTRACT
ABSTRACT
Various pollutants have been endangering water quality over the past decades. As a
result, predicting and modeling water quality have become essential to minimizing
water pollution. This research has developed a classification algorithm to predict
the water quality classification (WQC). The WQC is classified based on the water
quality index (WQI) from 7 parameters in a dataset using Support Vector Machine
(SVM) and Extreme Gradient Boosting (XGBoost). The results from the proposed
model can accurately classify the water quality based on their features. The
research outcome demonstrated that the XGBoost model performed better, with an
accuracy of 94%, compared to the SVM model, with only a 67% accuracy. Even
better, the XGBoost resulted in only 6% misclassification error compared to SVM,
which had 33%. On top of that, XGBoost also obtained consistent superior results
from 5-fold validation with an average accuracy of 90%, while SVM with an
average accuracy of 64%. Considering the enhanced performance, XGBoost is
concluded to be better at water quality classification.
One of the most valuable natural resources ever given to humans is water. The
ecosystem and human health are directly impacted by the water quality. Water is
used for many different things, including drinking, farming, and industrial uses.
Over the years, numerous pollutants have put water quality in danger. Predicting
and estimating water quality are now crucial to reducing water pollution as a result.
Real-time monitoring is unsuccessful because conventionally, water quality is
assessed using expensive laboratory and statistical processes. Low water quality
calls for a more workable and economical solution. The proposed system builds a
model that can forecast the water quality index and water quality class by utilizing
the advantages of machine learning techniques. This proposed system is to develop
a novel approach for water quality classification using Gradient Boosting
Classifier. The method includes the calculation of the Water Quality Index, which
is used as a measure of water quality. The proposed approach achieves a high Train
Accuracy of 98% and Test Accuracy of 94%. The approach uses various water
quality parameters and features such as pH, dissolved oxygen, temperature, and
electrical conductivity to classify water into different categories. The model
developed in this study is capable of predicting the water quality as Excellent,
Good, Poor and Very Poor, which can be used for real-time monitoring and
management of water quality. The results demonstrate the effectiveness and
accuracy of the proposed approach in predicting water quality, highlighting the
potential of machine learning techniques for water quality monitoring and
management. The proposed approach can be used in various applications such as
water treatment, environmental monitoring, and aquatic life management.