
Dissolved Oxygen (DO) Meter Hydrological Modelling Using Predictive Algorithms


Aljay R. Lorenzo1, Allysa Y. Dula2, Neil Aldrin C. Valeroso3, John David C. Munda4, Brian Noli G. Supang5,
Maria Victoria C. Padilla6, Gilfred Allen M. Madrigal7, Timothy M. Amado8, Lean Karlo S. Tolentino9
Electronics Engineering Department, Technological University of the Philippines
Ermita, Manila, Philippines

Abstract— Dissolved oxygen (DO) is one of the critical indicators of a body of water’s health and water quality. It refers to the free, non-compound oxygen present in water, and it influences the growth and survival of the aquatic organisms living in it. This study aims to develop a low-cost, multi-function device that determines the DO level through hydrological modelling of water parameters such as temperature, pH, and conductivity using Decision Tree, Random Forest, and Multi-layer Perceptron machine learning algorithms. Across the metrics used, the most efficient model was built using the Random Forest algorithm, which yielded the most reliable metrics of the three algorithms. The evaluated model has the following metrics: the Coefficient of Determination, or how well a model explains and predicts future outcomes, is 0.99; the Mean Absolute Error, or the average magnitude of the errors in a set of predictions, is 0.32; the Mean Squared Error, used to measure the performance of an estimator, is 0.36; and the Root Mean Squared Error, or how concentrated the data is around the line of best fit, is 0.60. Relative to Atlas Scientific’s DO Sensor, the device can predict the dissolved oxygen level of a given pond with 2.61% error. The final device is a handheld device consisting of the sensors for the highest-ranking parameters with respect to their relationship to DO: temperature, conductivity, and pH.

Keywords — water parameters, dissolved oxygen, Python, machine learning, hydrological modelling, water meter

I. INTRODUCTION

Water, easily the most ubiquitous resource, has numerous parameters with different implications regarding its state. One of the most important is the dissolved oxygen level. Knowing the dissolved oxygen level of a fish-filled body of water is crucial; for fish farmers in particular, it dictates their livelihood.

Dissolved oxygen (DO) is the most critical indicator of a body of water’s health and water quality [1]. The amount of DO present in a body of water influences the growth and survival of the aquatic organisms living in it. It is highly relevant to measure the DO level of aquaculture farms to ensure their capacity to support aquatic life. The DO level, however, depends on many factors such as temperature, salinity, oxygen depletion, oxygen source, and others [2]. DO can be measured with a dissolved oxygen meter, with a multi-parameter measuring device, or through laboratory testing using the Winkler method. However, DO meters are very costly, especially for small farmers, with prices ranging from Php 20,000 up to half a million pesos, or $384 to $9,600.00 [3], [15].

If one wishes to measure the DO level cost-effectively, the trade-off is a labor-intensive process. DO levels are traditionally measured by means of the Winkler method [4], where titration is used to account for the DO in a given water sample. The process uses a total of five reagents: sodium thiosulfate, manganese sulfate, alkali-iodide-azide, concentrated sulfuric acid, and starch solution. Aside from being labor-intensive, the method requires reagents that the common aquaculture farmer cannot easily obtain.

These constraints lead to evaluating the dissolved oxygen level of a body of water by developing hydrological models from other parameters, namely temperature, pH level, and conductivity, through a comparative study of several machine learning algorithms [11], [12], specifically (a) Decision Tree Regression (DTR); (b) Random Forest Regression (RFR); and (c) Multilayer Perceptron (MLP).

II. METHODOLOGY

The study relied mostly on the physical construction of a buoy for data acquisition. The data gathered by the buoy was analyzed and led to a practical and cost-effective solution for the development of the DO meter. After the development of the final device, as shown in Fig. 1, it was tested for accuracy. Using an Arduino Mega (ATMega2560) as microcontroller, the buoy recorded measurements from the sensors and saved them on a 4GB micro SD card.

The block diagram of this study is presented in Fig. 2. The gathered data was first prepared, followed by the modelling process, in which three predictive algorithms were tested. Finally, the results were evaluated.

978-1-7281-3044-6/19/$31.00 ©2019 IEEE

Authorized licensed use limited to: Auckland University of Technology. Downloaded on June 03,2020 at 07:45:46 UTC from IEEE Xplore. Restrictions apply.
Fig. 1 The prototype of the floating sensor design

Fig. 2 Block diagram

A. Data Acquisition

This study focuses on determining the inherent relationships that exist between the different physical parameters of water. In coordination with the Bureau of Fisheries and Aquatic Resources (BFAR) – Batangas, three weeks’ worth of data was gathered from a fish-breeding facility situated at Brgy. Ambulong, Batangas. A floating device (buoy) equipped with multiple sensors was developed to record the various pondwater parameters. The time at which these parameters were measured was also taken into consideration.

Fig. 3 Sensor node operation flowchart

Fig. 3 illustrates the flowchart of the developed sensor node. Fig. 4 presents a photo of the buoy being deployed in a concrete pond at BFAR-ITSO.

Fig. 4 Buoy gathering parameter readings from a pond

B. Application of Various Machine Learning Algorithms to Data

Following approaches tested in the relevant literature, the researchers used three algorithms to create a predictive model that quantifies the DO level from other known physical pondwater parameters. These algorithms are: (1) Random Forest Regression, (2) Decision Tree Regression, and (3) Multi-layer Perceptron.

C. Data Resampling and Parameter Selection

The gathered dataset is first filtered to rid it of errors, fragmented data, garbled sensor inputs, and similar defects. The filter accepts only complete records, that is, records in which every sensor reported its measured parameter and every value is non-erroneous.
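
The completeness rule described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' filtering code: the field names in `REQUIRED_FIELDS`, the helper names `parse_reading` and `filter_readings`, and the sample rows are all hypothetical, and "erroneous" is assumed here to mean a value that is missing, non-numeric, or not finite.

```python
import math

# Hypothetical field names; the paper does not list the logged column names.
REQUIRED_FIELDS = ("do", "ec", "tds", "sal", "sg", "turb", "temp", "ph")


def parse_reading(raw):
    """Return a dict of floats if the record is complete and numeric, else None."""
    parsed = {}
    for field in REQUIRED_FIELDS:
        text = raw.get(field)
        if text is None or str(text).strip() == "":
            return None  # a sensor did not report: incomplete record
        try:
            value = float(text)
        except (ValueError, TypeError):
            return None  # garbled sensor input
        if not math.isfinite(value):
            return None  # NaN or infinity is treated as erroneous
        parsed[field] = value
    return parsed


def filter_readings(rows):
    """Keep only the records that pass parse_reading."""
    return [p for p in map(parse_reading, rows) if p is not None]


# Made-up sample rows, for illustration only.
complete = {"do": "7.1", "ec": "210", "tds": "134", "sal": "0.1",
            "sg": "1", "turb": "12.5", "temp": "28.4", "ph": "7.9"}
missing_ph = {k: v for k, v in complete.items() if k != "ph"}
garbled_ec = dict(complete, ec="21#.x")
not_finite = dict(complete, temp="nan")

kept = filter_readings([complete, missing_ph, garbled_ec, not_finite])
print(len(kept))  # only the first row survives
```

Rejecting a whole record when any one field fails keeps every surviving row usable as a full feature vector, at the cost of discarding readings from sensors that did report.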
The filtered dataset is resampled into 1-minute, 5-minute, 10-minute, 30-minute and 60-minute intervals. This is done to identify which resampling interval produces the best model, as evaluated.

The resampled datasets were also used to quantify how relevant each considered parameter is to the predictive model built. By using Feature Importance, the relative importance of each parameter to the DO level is known.

D. Model Evaluation

Several predictive models were built based on various combinations of pondwater parameters. These models were evaluated using the following criteria: Coefficient of Determination (R Square), Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE).

A higher R Square suggests that there is less error or unexplained variance and, therefore, better prediction and a more precise DO level [17].

MAE measures how strongly errors affect the accuracy of the model built. MSE is kept at a minimum to ensure that the predicted DO level is close to the actual DO level. A lower RMSE is desired to avoid large errors between the predicted and actual levels of DO.

E. Design of a Cost-Effective DO Meter

The whole process of the meter’s development resulted in the handheld device shown in Fig. 5. The parts seen in the image are labeled. The receptacle contains the sensors, for protection and for easy handling of the meter. The chassis holds the circuits and the power supply of the device. The LCD display shows the value of the predicted dissolved oxygen level.

Fig. 5 Buoy, floating sensor design

III. EXPERIMENTS AND RESULTS

A. Nature of Filtered Dataset

Table I and Table II illustrate the statistical description of the filtered data set with the following parameters: Dissolved Oxygen (DO) level in mg/L; Electrical Conductivity (EC) in µS/m; Total Dissolved Solids (TDS) in ppm; Salinity (SAL) in psu; Specific Gravity (SG); Turbidity (TURB) in NTU; Temperature (Temp) in °C; pH (PH) Level.

The gathered dataset has a sample size of 1,019,189. The DO level ranges from 3.00 mg/L to 37.4 mg/L. Electrical conductivity has the highest deviation at ~133 µS/m. Total dissolved solids has a mean of ~134 ppm and ranges from 17 ppm to 450 ppm. Salinity of the pondwater ranges from 0 psu to 0.4 psu. The specific gravity is constant at 1. The turbidity ranges from -2289.0 NTU to 37.4 NTU. Temperature ranges from 25.6 °C to 32.2 °C. The pH of the pondwater ranges from about 6.59 to 10.8.

B. Evaluation of Parameter Relevance to DO

Features were selected based on Feature Importance, a method that uses the Decision Tree Regression (DTR) and Random Forest Regression (RFR) algorithms. For most of the evaluations made using samples from different time intervals, the key parameters which yielded notable importance coefficients are as follows: time, pH, temp, EC, TDS and SAL.

C. Model Evaluation

Knowing the degree to which each parameter affects the DO level, various predictive models are created using RFR, DTR, and MLP, with different combinations of parameters as input. These models are evaluated based on R Square, MAE, MSE, and RMSE.
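
As an illustration of the four criteria, the sketch below computes them from paired values in plain Python, using the textbook definitions. It is not the authors' evaluation code: the function name `regression_metrics` is hypothetical, the two lists are the ten Random Forest field trials listed in Table III, and treating the first DO column as the reference series for R Square is an assumption (MAE, MSE and RMSE do not depend on that choice).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R Square, MAE, MSE and RMSE for paired reference/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    rmse = math.sqrt(mse)                        # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse}


# The two DO columns (mg/L) of the ten RFR trials in Table III.
# Assumption: the first DO column serves as the reference series for R Square.
reference = [15.83, 19.29, 19.68, 19.68, 21.24, 21.21, 21.24, 18.5, 21.17, 18.5]
predicted = [16.55, 19.26, 20.98, 19.22, 21.19, 22.36, 21.39, 18.27, 21.97, 18.13]

metrics = regression_metrics(reference, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the printed values (0.831, 0.526, 0.460 and 0.678) agree with the RFR column of Table IV. Note that RMSE is simply the square root of MSE, which also holds for every row of Table I.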

For ease of discussion, only the evaluations using the datasets resampled every minute, every 10 minutes, and every hour are shown.

TABLE I. MODEL EVALUATION

Interval           Model   R Square    MAE      MSE        RMSE
Every Minute       RFR     0.992       0.322    0.360      0.600
Every Minute       DTR     0.984       0.386    0.715      0.846
Every Minute       MLP     0.452       3.869    23.864     4.885
Every 10 Minutes   RFR     0.984       0.570    0.696      0.834
Every 10 Minutes   DTR     0.967       0.695    1.456      1.207
Every 10 Minutes   MLP     0.196       4.826    34.976     5.914
Every Hour         RFR     0.889       1.629    4.686      2.165
Every Hour         DTR     0.656       2.424    14.487     3.806
Every Hour         MLP     -129.419    69.738   5495.280   74.130

It can be inferred from Table I that the predictive model built using the every-minute dataset, with the top 6 parameters as input, is superior, with the highest R Square of 0.992 and the lowest MAE, MSE and RMSE of 0.322, 0.360, and 0.600 respectively. Moreover, in all evaluations MLP performed worst, having the lowest R Square and high levels of error. For the every-10-minutes dataset, RFR again produced desirable predictive models with high R Square and low MAE, MSE, and RMSE; the highest R Square (0.98401) is obtained by a model built using the top 6 parameters as input. Similar observations were made for the every-hour dataset: RFR built a desirable predictive model with a high R Square and low MAE, MSE, and RMSE. Likewise, this result is obtained by a model built using the top 6 parameters as input.

From these simulations, it can be inferred that the top performing models are built from per-minute sampling, using the Random Forest and Decision Tree algorithms based on the following top 6 key parameters: (1) time when the data was taken, (2) pH, (3) temperature, (4) electrical conductivity, (5) total dissolved solids, and (6) salinity.

With this, the researchers tested how well these models perform in an actual setting. The models were uploaded to the R-pi microcontroller of the handheld device developed by the researchers, and their predictions were tested on the same pond where the data was gathered.

Tables II and III show the measurements taken during the testing: the predicted DO and the actual DO.

TABLE II. DEVICE EVALUATION – DECISION TREE

DECISION TREE REGRESSION ALGORITHM

Trial No.   Time       Actual DO (Atlas Scientific)   Predicted DO   Percent Error
1           11:01      17.84                          15.95          11.850
2           12:09      18.21                          14.84          22.709
3           12:11      18.21                          16.67          9.238
4           12:15:50   18.21                          16.37          11.240
5           12:18:50   18.21                          20.01          8.996
6           12:26:50   18.21                          20.34          10.472
7           13:21      18.24                          17.85          2.185
8           13:50:31   18.24                          17.98          1.446
9           14:42:48   19.5                           20.03          2.646
10          14:55:12   15.1                           14.89          1.410
AVERAGE:                                                             8.219

TABLE III. DEVICE EVALUATION – RANDOM FOREST

RANDOM FOREST REGRESSION ALGORITHM

Trial No.   Time       Actual DO (Atlas Scientific)   Predicted DO   Percent Error
1           11:10:31   15.83                          16.55          4.350
2           13:36:03   19.29                          19.26          0.156
3           14:51:38   19.68                          20.98          6.196
4           14:56:42   19.68                          19.22          2.393
5           15:12:20   21.24                          21.19          0.236
6           15:21:21   21.21                          22.36          5.143
7           15:23:20   21.24                          21.39          0.701
8           15:25:21   18.5                           18.27          1.259
9           15:28:21   21.17                          21.97          3.641
10          15:29:47   18.5                           18.13          2.041
AVERAGE:                                                             2.612

As can be seen in Tables II and III, the average percent errors between the predicted DO level and the measured level are 8.22% and 2.61% using the DTR and RFR algorithms respectively. From this, it can be inferred that the most effective model uploaded into the device is the one built using the Random Forest Regression algorithm, based on the top 6 parameters. The metrics of the testing are summarized in Table IV.

TABLE IV. DEVICE EVALUATION SUMMARY

Metric                         RFR     DTR
Coefficient of Determination   0.831   0.263
Mean Absolute Error            0.526   1.396
Mean Squared Error             0.460   2.901
Root Mean Squared Error        0.678   1.703

D. Statistical Analysis

To test whether the difference between these percent errors is significant, a statistical analysis was performed. The researchers used an equal variance t-test since the two samples are of equal size.

The following hypotheses were established:

H0: There is no significant difference between the percentage errors of predicted and measured DO level, using

predictive models built through the Random Forest Regression (RFR) and Decision Tree Regression (DTR) algorithms.

HA: There is a significant difference between the percentage errors of predicted and measured DO level, using predictive models built through the Random Forest Regression (RFR) and Decision Tree Regression (DTR) algorithms.

Table V shows the statistics obtained using the t-test.

TABLE V. T-TEST STATISTICS

Statistic      Coefficient
Mean diff.     5.607
SE             2.210
t value        2.537
df             18
two-tailed p   0.021

Performing the equal variance t-test with degrees of freedom (df) = 18, the computed two-tailed p-value is 0.020638. Since p < 0.05, the researchers reject the null hypothesis and accept the alternative hypothesis.

Hence, there is a significant difference between the means of the percentage errors of predicted and measured DO level for the predictive models built through the Random Forest Regression (RFR) and Decision Tree Regression (DTR) algorithms. This suggests that the predictive model built using Random Forest should be adopted.

E. Device Comparison with Atlas Scientific DO Sensor

Table VI summarizes the comparison between DO Sense, the handheld device developed, and the Atlas Scientific DO Sensor.

TABLE VI. DO SENSE VS ATLAS SCIENTIFIC DO SENSOR

Device                  DO Sense                                   Atlas Scientific DO Sensor with peripherals
Nature                  Multiparameter                             Dissolved Oxygen
Measuring Range         DO: 3.067 – 37.43 mg/L                     DO: 0 – 100 mg/L
                        pH: 0 – 14
                        Temperature: -50 ºC – 125 ºC
                        Electrical Conductivity: 5 – 200k µS/cm
                        Total Dissolved Solids – ppm
                        Salinity – psu
Reading Stabilization   1 to 2 minutes                             1 minute
Cost                    Php 23,867.00                              Php 19,000.00

IV. CONCLUSIONS

Considering the study’s findings, the researchers were successful in developing a sensor node equipped with various sensing devices, which gathers measurements of the defined water parameters (temperature, pH, and conductivity). DO Sense, the device developed, is a handheld device consisting of sensors for each of the highest-ranking parameters with respect to their relationship to DO: temperature, pH, and conductivity. In comparison [13] with Atlas Scientific’s DO Sensor, the device was able to determine the value of DO with only 2.61% error. Also, several predictive models were formulated and analyzed according to the pertinent parameters, and out of these models, the Random Forest Regression algorithm was found to be the most efficient [18] and applicable method based on the results.

For future work, it is recommended to add more sensors in order to analyze how variations in the chemical properties of water correlate with the dissolved oxygen level.

ACKNOWLEDGMENT

The authors would like to express their gratitude to the Bureau of Fisheries and Aquatic Resources (BFAR) of Region IV-A, Philippines, for support on the experiments and validation activities, and to the Technological University of the Philippines (TUP) for the assistance.

REFERENCES

[1] K. Singh, A. Basant, A. Malik, G. Jain, “Artificial neural network modeling of the river water quality-A case study,” Ecological Modelling 220, pp. 888–895, 2009.
[2] W. Wei, D. Changhui, L. Xiangjun, G. Jun, “Soft-sensor Software Design of Dissolved Oxygen in Aquaculture,” 2017.
[3] D. Mulkerrins, A. D. W. Dobson, and E. Colleran, “Parameters affecting biological phosphate removal from wastewaters,” Environ. Int., vol. 30, no. 2, pp. 249–259, Apr. 2004.
[4] S. W. H. Van Hulle, H. J. P. Vandeweyer, B. D. Meesschaert, P. A. Vanrolleghem, P. Dejans, and A. Dumoulin, “Engineering aspects and practical application of autotrophic nitrogen removal from nitrogen rich streams,” Chem. Eng. J., vol. 162, no. 1, pp. 1–20, Aug. 2010.
[5] J. Kalff, Limnology: Inland Water Ecosystems. Prentice-Hall, Upper Saddle River, NJ, 2002.
[6] D. Willemsen, “Lights and Buoys Sailing Issues,” [Online]. Available: http://www.sailingissues.com/navcourse9.html, 2018.
[7] The Atlas Scientific Website. [Online]. https://www.atlas-scientific.com/_files/_datasheets/_circuit/do_EZO_datasheet.pdf, 2018.
[8] The Atlas Scientific Website. [Online]. https://www.atlas-scientific.com/_files/_datasheets/_circuit/ec_EZO_datasheet.pdf, 2019.
[9] The DF Robot Website. [Online]. https://www.dfrobot.com/wiki/index.php/Industrial_pH_electrode(SU:FIT0348), 2017.
[10] The DF Robot Website. [Online]. https://www.dfrobot.com/wiki/index.php/Waterproof_DS18B20_Digital_Temperature_Sensor_(SKU:DFR0198), 2017.
[11] J. Brownlee, “Gentle Introduction to Predictive Modeling,” Machine Learning Mastery, https://machinelearningmastery.com/gentle-introduction-to-predictive-modeling/.
[12] T. Chakravorty, “How Machine Learning Works: An Overview,” TheNewStack, https://thenewstack.io/how-machine-learning-works-an-overview/2015, October 2016.
[13] C. H. Zang, “Comparison of Relationships Between pH, Dissolved Oxygen and Chlorophyll a for Aquaculture and Non-aquaculture Waters,” Water, Air, & Soil Pollution, pp. 157-174, 2011.
[14] M. Bruckner, “The Winkler Method - Measuring Dissolved Oxygen,” Microbial Life Educational Resources.
[15] Y. Chen, H. Yu, Y. Cheng, Q. Cheng, D. Li, “A hybrid intelligent method for three-dimensional short-term prediction of dissolved

oxygen content in aquaculture,” PLoS ONE 13(2): e0192456. https://doi.org/10.1371/journal.pone.0192456, 2018.
[16] Fondriest Environmental, Inc. “Dissolved Oxygen.” Fundamentals of
Environmental Measurements, November 2013.
[17] Zhong Xiao, Lingxi Peng, Yi Chen, Haohuai Liu, Jiaqing Wang, and
Yangang Nie, “The Dissolved Oxygen Prediction Method Based on
Neural Network,” Complexity, vol. 2017, Article ID 4967870, 6 pages,
2017.
[18] J. K. Jaiswal and R. Samikannu, "Application of Random Forest
Algorithm on Feature Subset Selection and Classification and
Regression," 2017 World Congress on Computing and Communication
Technologies (WCCCT), Tiruchirappalli, 2017, pp. 65-68.
