Received 13 March 2024, accepted 28 March 2024, date of publication 3 April 2024, date of current version 26 June 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3384263

IoT-Based Smart Biofloc Monitoring System for Fish Farming Using Machine Learning
MUHAMMAD ADEEL ABID 1 , MADIHA AMJAD 2 , KASHIF MUNIR 2 ,
HAFEEZ UR REHMAN SIDDIQUE 1 , AND ANCA DELIA JURCUT 3 , (Member, IEEE)
1 Institute of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan 64200, Pakistan
2 Institute of Information Technology, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan 64200, Pakistan
3 UCD School of Computer Science, University College Dublin, Dublin 4, D04 C1P1 Ireland

Corresponding authors: Anca Delia Jurcut ([email protected]) and Muhammad Adeel Abid ([email protected])
This work was supported in part by University College Dublin and in part by the Irish Research Council Project
CHIST-ERA-22-SPiDDS-07.

ABSTRACT Biofloc technology assists in increasing the sustainability of fish farming by reusing and
recycling waste water. However, its sophisticated operation makes it very sensitive to environmental
conditions. A slight disturbance in one or more parameters can lead to high fish mortality and loss. IoT
systems provide an efficient way of closely monitoring the biofloc to avoid catastrophe. The best aqua
conditions vary depending on the fish. Therefore, there is a strong need to explore ideal conditions for
different fish species. In this work, we have focused on Tilapia in the southern Punjab region to find the most
suitable parameters. We have developed an IoT solution for monitoring Biofloc and gathering data. We have
used low-cost sensors in our product to make it feasible for poor fish farmers. Multiple machine learning
algorithms such as decision trees, random forest, support vector machine, logistic regression, Gaussian naive
Bayes, XGBoost and ensemble learning are applied to the collected dataset to effectively predict mortality.
Our analysis shows that random forest and XGBoost achieved 98% accuracy in estimating mortality.
The union of IoT, machine learning, and affordability positions our study at the forefront of advancing
sustainable aquaculture practices in southern Punjab, Pakistan.

INDEX TERMS IoT automation, Biofloc, machine learning, mortality of fish, prediction.

I. INTRODUCTION
Malnutrition is becoming an increasingly challenging issue as the population grows. There is a major thrust towards improving crops and livestock to increase their productivity. Crop production is increased through gene mutation and improved insecticide production. Meat production is increased through improved poultry and animal farming in an advanced research-oriented environment. Fish is another significant source of meat [1] that cannot be neglected. Gifted tilapia (gene mutation), shrimps, stinging catfish, dombra, rahu, malli, thella, and pangasius are primarily produced in the south Punjab region of Pakistan. Traditional methods like simple pond and cage farming are adopted for fish production. Common problems in fish farming include the large amount of space and water required, feed measurement, and the pH, nitrogen, dissolved oxygen, and ammonia levels in water.

In contrast to traditional farming, tank culture fish farming [2] provides an opportunity to grow a large number of fish in a limited space. Figure 1 shows a Biofloc water tank. However, fish farming faces the major challenges of environmental impact and excessive water requirements. Biofloc fish farming [3] is the latest technology for low-cost and sustainable fish production. The fish yield of a one-acre pond can be produced using a 16-diameter and 6 feet depth Biofloc tank. A large amount of fish is produced in a small Biofloc tank, and feed cost is saved by using probiotics that convert fish waste and remaining fish feed into protein, which helps minimize the cost of fish feed [4]. Flow-in and flow-out pipes are installed to change the water of the Biofloc tank. An oxygen pump and other techniques are used to maintain the oxygen level of the Biofloc tank. About 10 liters (gradually increasing to

The associate editor coordinating the review of this manuscript and approving it for publication was Diego Oliva.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024 86333
M. A. Abid et al.: IoT-Based Smart Biofloc Monitoring System for Fish Farming

20 liters with the growth of fish) of water is needed per fish for proper growth in a Biofloc tank. Different probiotics [5] are available that convert fish waste into protein that can be reused.

A kit is available to measure the Biofloc tank's pH, dissolved oxygen, solid waste, ammonia, and nitrogen levels [6]. Manual work is required to measure these values. A person is needed to check the Biofloc tank at short intervals and make adjustments accordingly. Maintaining appropriate water quality is crucial in Biofloc fish farming.

FIGURE 1. Biofloc Water tank.

Water quality is disturbed by a change in the pH level of the water, a decrease of dissolved oxygen (DO), an increase of solid waste, an increase in the ammonia level of the tank, and an increase in the nitrogen level of the water [7]. Furthermore, water quality suffers if any of the pumps, such as the water or oxygen pump, stops working or the recommended quantity of probiotic is not used. Poor water quality badly affects fish growth and may lead to mortality of the fish on a large scale.

South Punjab is an underdeveloped area of Punjab province and needs help with the availability of necessities of life such as food, drinking water, and health facilities. According to Mr. Arshad (Assistant Director, Fisheries Department Rahim Yar Khan), about 14 Biofloc units are installed in district Rahim Yar Khan. The life cycle starts with stocking fish seed in February and ends with the harvest in November and December. The water quality of Biofloc tanks is maintained manually, and some persons are deputed permanently to Biofloc tanks to check the water quality. A survey of Biofloc owners' problems was also performed, and a survey sample is shown in Figure 2. The survey concluded that an automatic system is needed that can predict bad water quality 1-2 hours before expected fish mortality based on pH, DO, solid waste, ammonia, and nitrogen level. This is because fish farmers face difficulty in managing the water quality parameters, and manpower is needed to check the water quality parameters all the time. In case of disturbance in water quality parameters, mortality of the fish is reported on a large scale. Such a system will help minimize the mortality rate of fish in the Biofloc fish tank and will be beneficial for fish owners. It will also decrease the loss factor and increase the fish production per annum.

FIGURE 2. Survey sample.

IoT (Internet of Things) provides automated solutions to real-world problems [8], [9], [10]. IoT contributes to fields like home automation, smart cities, smart manufacturing, industrial automation, agricultural automation, and wearables. IoT comprises communication technologies like RFID, ZigBee, barcode, Bluetooth, NFC, and WiFi, together with many sensors [11]. Biofloc fish farming is productive as well as sensitive. It needs to be closely monitored, since a minor mistake may lead to the mortality of fish on a large scale, and the setup may incur a loss. IoT can help with real-life problems, and it will be beneficial in solving this problem as well [12]. For developing countries like Pakistan, a cost-effective local IoT-based biofloc monitoring solution is required. The feed requirements and the sensitivity of fish to different levels of nitrogen, ammonia, and other parameters vary between species. It is therefore important to determine the optimal parameter values for the local fish and to calibrate the system to raise an alarm before fish mortality occurs. Therefore, in this work, we have focused on designing a solution for the southern Punjab region of Pakistan for Tilapia. In particular, our contributions in this paper are as follows.
• A solar-powered, low-cost IoT-based automation solution is designed and implemented.
• Extensive data related to water quality parameters have been observed and recorded for over 1.5 months (data collected in May and June 2023, considered the most crucial time period in the fish growth life cycle).
• Multiple machine learning algorithms such as decision trees, random forest, support vector machine, logistic


regression, Gaussian naive Bayes, and XGBoost have been calibrated for the given data and compared to find the model that provides the highest accuracy in early prediction of mortality depending upon water quality parameters.
• Correlations between different water quality parameters are identified.
The challenges faced in this research, related to IoT, energy efficiency, data rate, outage probability, and other aspects, are as follows.
• A minimum-cost solution is needed; this is handled by using the minimum number of sensors so that the cost decreases.
• Exact calibration of the different sensors according to the environment is required and is achieved by checking and measuring with different instruments.
• Due to the storage limitation of the Thingspeak cloud server (8200 entries per day), a 2-minute data storage interval is selected.
• Data for the experiments must be collected during the period that is most crucial for fish growth/mortality; the period selected for data collection is May and June.
It is the most cost-effective solution for the southern Punjab region, as it costs about $60, while the fish seed (500 Pangasius) costs $174. The union of IoT and machine learning results in an effective system that can be used over time and can contribute to reducing the mortality of the fish in a cost-effective way.

The rest of the paper is organized as follows. Section II presents a literature review of related work in the field of Biofloc automation. Section III describes the experimental setup of the IoT solution for automating Biofloc, including the hardware, software, and methodology adopted for this research. Section IV presents the results, and finally we conclude the paper in Section V.

II. LITERATURE REVIEW
A number of researchers have focused on Biofloc fish farming. Most of them researched and proposed solutions for their local Biofloc fish farming industry. The most significant works are discussed here. Goswami et al. [13] presented a complete description of the equipment needed to build an ideal Biofloc system. The authors also describe cost-saving devices for building a Biofloc water tank. Nikhita Rosaline et al. [14] presented an IoT-based solution for aquaculture monitoring. They used a temperature sensor, ammonia sensor, dissolved oxygen sensor, salinity sensor, water level sensor, NodeMCU ESP12E controller, Wi-Fi controller, and web server for sending SMS. The research focuses on the growth of shrimp, fish, and other aquatic animals. Saha et al. [15] focus on the automation of Biofloc using IoT. They used the respective sensors to monitor pH, dissolved oxygen, temperature, and water level. They used Arduino Mega as the main board and displayed all the results on an LCD. An acid solution, base solution, air pump, raw salt solution, distilled water, and water pump were used for controlling the environment. The MQTT protocol was used for this research, and a mobile app was also built to display results.

Islam [16] proposed a prediction system based on IoT and KNN. The work targeted eight fish species: Silver Carp, Tilapia, Pangas, Sing, Koi, Rui, Prawn, and Katla. It used an Arduino UNO and Ethernet shields with internet connectivity. A Thingspeak server stores the sensor values so that KNN can be applied to the collected data for training and testing the proposed model. The proposed model predicts the worst conditions. An MQ-7 sensor for observing CO (carbon monoxide) was also used in the experimental setup. Mahmuda et al. [17] presented an IoT-based water quality system for Biofloc water tanks based on image processing. They used a Raspberry Pi, a camera, a chemical pump, a water pump, a temperature sensor, a humidity sensor, and a pH sensor. Furthermore, image processing was used to measure the dissolved oxygen and ammonia level. IFTTT was used in this research for sending emails. The main aim of this research is to predict water quality using low-cost sensors. Arefin et al. [18] proposed an IoT-based water quality management system based on a regression tree. The results were displayed on a mobile app. Dissolved oxygen, nitrogen, pH, temperature, nitrate, ammonia, and carbon dioxide are observed through sensors. The Google Firebase cloud service was used for storing readings obtained from the sensors. The proposed system attained an accuracy of 79%, which is quite acceptable. Rahid [19] provided a prediction of the water quality of Biofloc tanks using IoT and machine learning models for Bangladesh Biofloc fish farming. The author used an Arduino UNO, pH sensor, temperature sensor, TDS sensor, and a neural network for training the proposed model. Three months of data are used for this purpose, and the data contains Date, pH, temperature, TDS, NH3, and Floc quantity. Python 3.8, with the support of TensorFlow, Keras, Pandas, Matplotlib, NumPy, and box plots, was used for programming. The author obtained an accuracy of 77.3 for the proposed model regarding the IoT Biofloc solution. Blancaflor et al. [20] worked on remote water quality management. They divided the solution into three layers: an application layer (mobile app), a middleware layer (API, cloud database), and a physical layer (pH sensor, temperature sensor, dissolved oxygen sensor, feed feeder, heater, fan, motor). Ten respondents were involved in validating the results. They obtained non-functional and functional mean ratio scores of 3.65 and 3.82, respectively, regarding water quality management.

Blancaflor et al. [21] assessed an automated IoT-based Biofloc water quality system for the growth and mortality of Litopenaeus vannamei. The authors concluded that the water quality system gave a 10% higher survival rate. They divided the solution into three layers: application layer, middleware layer, and physical layer. For the experiments, they used an Arduino ATMEGA 2560, Wi-Fi routers, cloud services, pH sensor, temperature sensor, DO sensor, feeder, solar power, heater, fan, motor, etc. Ahmed et al. [22]

provided a real-time water quality system for the Biofloc water tank. They observed values and noted the readings in a dataset with a 1-hour gap. They used an ESP32 IoT Wi-Fi controller board, temperature sensor, pH sensor, TDS sensor, water level sensor, servo motor, water pump, heater, cooling fan, oxygen pump, water filter, and LCD for displaying results. They also observed that if the temperature increases, then dissolved oxygen decreases. Furthermore, if pH increases, then ammonia rises, and if pH decreases, then ammonia is converted into NH4+ (ammonium ion) and OH- (hydroxide ion). Ahammad et al. [6] suggested a feeding system for Biofloc technology with the support of a GSM device. The proposed method is wholly based on the pH and temperature values of the Biofloc water tank. The researchers used an Arduino, a servo motor (for feeding), LED (for displaying results), GSM, pH sensor, and temperature sensor.

Goswami et al. [13] proposed an intelligent system for Biofloc fish tanks in Bangladesh. They used a pH sensor, TDS sensor, electrical conductivity sensor, temperature sensor, Recirculation Aquaculture System (RAS), and a solar system as a power source. An image processing technique was applied to calculate the weight of the fish. A thermal scanner app was used for scanning the fish and calculating the approximate weight. Prakosa et al. [23] focused on an acidity level monitoring system for Biofloc tank water. They used a pH sensor, Arduino, database, LCD, and thermo-hygrometer to monitor the Biofloc acidity. Blancaflor et al. [24] focused on cost and profit comparison. The researchers explored the economic impact of solar-powered water quality management systems for Biofloc water tanks. The authors found that the ROI (Return on Investment) for the target investors and fish farmers is 112% and 103.47%, respectively. They performed a market survey for Biofloc, and 83.3% of respondents thought that water quality plays a vital role in the survival and growth of the fish.

Due to the different levels of sensitivity and feed requirements of the various species of fish, it is necessary to analyze and determine critical values of different water-related parameters specific to a particular fish. No significant work on automating biofloc in Pakistan, especially in southern Punjab, has been reported. Moreover, the previous work on determining the impact of different values on mortality achieves lower accuracy. Our research in this paper focuses on these research gaps.

III. SMART BIOFLOC MONITORING SYSTEM
We have built a custom IoT-based smart biofloc system. For this, we have used cheap sensors so that the system is affordable for poor fish farmers. We have used this system to collect the dataset specifically for fish that are local to the south Punjab region of Pakistan. We have used sensors to monitor the quality of water.

Figure 3 describes the IoT automation [25] of the water quality prediction system of the Biofloc tank, which is initiated by capturing values via sensors from the Biofloc water tank. Microcontroller boards and sensors like MQ-7, MQ-135, pH, TDS, DHT11, and temperature are used to accomplish the task. The MQ-7 and MQ-135 gas sensors are used to measure carbon monoxide and ammonia gas, respectively, in the biofloc water tank [26]. The pH sensor is used to detect the pH of the water because pH is the most sensitive and dominant factor of the water quality system. The TDS sensor identifies the dissolved solids in the Biofloc water tank. The DHT11 sensor is used to measure the external temperature and humidity of the environment. The temperature sensor measures the temperature of the biofloc water tank, as fish growth is greatly affected by poor temperature (water that is too hot or too cold). All these sensors are connected to the Arduino UNO.

Further, a solar power supply with backup is used to power the sensors, boards, and other components. This is done because we need to continuously monitor the water quality of the Biofloc water tank. If power fails, the automation system will be turned off, which is unacceptable. So, a separate power supply that works on solar power in the daytime and has 16 hours of backup time at night is used as well [27].

An ESP8266 NodeMCU WIFI microcontroller is connected to the Arduino UNO because internet access is now widely available. This WiFi module connects the Arduino UNO and sensors to the Cloud to save values periodically. All sensor values are collected and treated as raw data. Then preprocessing techniques are applied to the raw data to clean it. Feature extraction is performed on the data obtained after preprocessing. The obtained dataset is further split into training and testing sets. Various machine learning algorithms were trained using the training data and compared. Further, the parameters that affect the fish's mortality much more than other parameters are also identified. The accuracy of the machine learning algorithms has been evaluated using different evaluation parameters with the help of the testing data. Data visualization and early alarms are provided via Web applications and Android apps. Early prediction is made before the mortality of the fish. Complete details of the hardware architecture, cloud server, and machine learning applied to the sensor data are discussed below.

A. HARDWARE ARCHITECTURE
Water quality parameters of the Biofloc fish tank are collected through different sensors connected to the Arduino UNO microcontroller board and NodeMCU ESP8266 WIFI board [28]. The sensors used in our system are the MQ-7 sensor, MQ-135 sensor, TDS sensor, turbidity sensor, DHT11 sensor, water temperature sensor, and pH sensor. A description of the microcontroller boards and sensors is given below.

1) ARDUINO UNO
Arduino UNO is the most widely used microcontroller board in IoT projects, and it is based on the ATmega328P microcontroller. It contains six analog pins and


FIGURE 3. Smart Biofloc Monitoring System.

14 digital pins. It works on 5V. Furthermore, it has a USB connection, a reset button, and a power jack.

2) NODEMCU ESP8266 WIFI
The NodeMCU ESP8266 WIFI microcontroller is a small board with a WIFI module. It operates on 3.3V and is connected via micro-USB. It contains one analog pin and 11 digital pins.

3) MQ-7 SENSOR
The MQ-7 sensor is used to record the CO gas emitted in the surroundings. It helps maintain the air quality. Its measurement ranges from 10 to 10,000 ppm. This sensor takes 5V for operation and is connected via an analog pin to the microcontroller board.

4) MQ-135 SENSOR
The MQ-135 sensor is used to check the air quality of the environment [29]. MQ-135 is used to detect ammonia, sulfur, benzene, CO2, smoke, and other harmful gases. In this research, this sensor is used to measure ammonia gas. Its measurement ranges from 10 to 1000 ppm (hydrogen, smoke, ammonia gas, and toluene).

5) TDS SENSOR
TDS stands for Total Dissolved Solids; the TDS sensor measures the dissolved solids in water. TDS sensors are widely used in aquaculture environments because they can accurately measure TDS. Its measurement ranges from 0 to 1000 ppm.

6) TURBIDITY SENSOR
A turbidity sensor is used for checking water quality. It measures the light intensity scattered by suspended particles in water. The turbidity level of water increases with the increase of TSS (total suspended solids). Its measurement ranges from 0 to 4000 NTU.

7) DHT11 SENSOR
The DHT11 sensor is handy as it measures the temperature and humidity in the environment. The temperature range of the DHT11 sensor is −20 °C to 60 °C and the humidity range is 5 to 95% RH.

8) WATER TEMPERATURE SENSOR
The water temperature sensor is used to measure the temperature of the water. Temperature affects the behavior and response of aquatic animals. The water temperature sensor ranges from −5 °C to +50 °C.

9) PH SENSOR
The pH sensor is used to measure the pH of the water [30]. The pH scale ranges from 0 to 14, where 7 is considered neutral and good for drinking water.

Figure 4 shows the practical implementation of the proposed system.
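The measurement ranges quoted in the sensor descriptions above can be collected into a small lookup table and used to flag implausible readings before they are stored. The following is a minimal sketch under stated assumptions: the field names and the `out_of_range` helper are hypothetical and do not come from the paper, and the bounds simply repeat the figures given in the text.

```python
# Measurement ranges as quoted in the sensor descriptions of this section.
# Field names and the helper below are illustrative, not taken from the paper.
SENSOR_RANGES = {
    "co_ppm": (10.0, 10000.0),       # MQ-7
    "ammonia_ppm": (10.0, 1000.0),   # MQ-135
    "tds_ppm": (0.0, 1000.0),        # TDS sensor
    "turbidity_ntu": (0.0, 4000.0),  # turbidity sensor
    "ext_temp_c": (-20.0, 60.0),     # DHT11, as stated in the text
    "humidity_rh": (5.0, 95.0),      # DHT11
    "water_temp_c": (-5.0, 50.0),    # water temperature sensor
    "ph": (0.0, 14.0),               # pH sensor
}


def out_of_range(reading):
    """Return the names of fields that are missing, non-numeric, or outside their range."""
    bad = []
    for name, (low, high) in SENSOR_RANGES.items():
        value = reading.get(name)
        if not isinstance(value, (int, float)) or not (low <= value <= high):
            bad.append(name)
    return bad
```

A reading that fails this check may point to a sensor still in its calibration period rather than to a real change in the tank, so it is a candidate for removal during preprocessing rather than a trigger for an alarm.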


TABLE 1. Data Collection Summary.

FIGURE 4. Practical Implementation.

B. CLOUD SERVER
Cloud servers like Thingspeak, Microsoft's Azure, Google's Cloud, and Amazon's AWS are available. For the proposed system, the Thingspeak cloud server is used as it is free and supports 8200 values of storage per day in the free version. The channel named "BioFloc Data Collection" with Channel ID "2081173" is created to store the sensor values. Sensors are set up to take readings from the Biofloc water tank every 2 minutes, and these values are then stored on the cloud server via the NodeMCU ESP8266 WIFI module. There are several cloud services [31] available for storing the sensor data, but the Thingspeak server is used to store the sensor readings. Figure 5 shows the IDE of the Thingspeak cloud server.

FIGURE 5. Thingspeak cloud server IDE.

C. DATA COLLECTION
First, a Biofloc water tank setup is installed to collect the dataset used for this study. Tilapia is the most common fish in the south Punjab region and is used for this research. Gas sensors, pH sensors, TDS sensors, temperature sensors, and other supporting devices/microcontrollers are installed on the Biofloc water tank to collect data [16], [22]. The Arduino UNO board is interfaced to the WIFI module to send data to a cloud server. Data is collected for 1.5 months (13-May-2023 to 02-July-2023) with an interval of 2 minutes, and 19878 values are collected against each sensor.

Table 1 shows the statistics of the dataset used for this research article. There is a collection of values under different categories. Date & Time is the date and time at which the value from the sensor is received by the Thingspeak cloud server [32]. CO stands for Carbon monoxide present in water. Ammonia level is the value obtained from the MQ135 sensor that measures the amount of ammonia present. Humidity and external temperature of the environment are recorded via the DHT11 sensor, as the Biofloc fishpond is also affected by the surrounding environment [33]. Turbidity refers to the cleanliness of water and is collected via the turbidity sensor. TDS stands for Total Dissolved Solids present in water and is measured using a TDS sensor [34]. The pH of the water ranges from 0 to 14, and 7 is considered the most suitable for drinking water. The pH value of the water of the Biofloc fish tank is collected via a pH sensor.

D. DATA VISUALIZATION
This section illustrates, in visual form, the dataset collected for this research. Overall, 18978 values against each sensor are collected on the Thingspeak cloud server. After preprocessing, 18342 values remain for further experiments, and about 3% of the values are omitted during preprocessing. Figure 6 shows the dataset statistics after applying preprocessing.

FIGURE 6. Statistics of dataset used for experiment purpose.

E. DATA PREPROCESSING
The dataset collected for the experiments contains numeric values. Still, some missing values or outliers in the dataset need to be sorted out before the experiments are done [35].
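The kind of cleaning this section refers to (dropping incomplete rows, stripping stray characters such as "\r\n", converting columns to float, encoding Mortality as 0/1, and scaling) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the paper reports using Pandas, the helper names are hypothetical, and z-score scaling is an assumption because the text does not name the scaler.

```python
from statistics import mean, pstdev

# Column names follow the dataset description in the paper.
NUMERIC_COLUMNS = ["CO", "Ammonia", "Humidity", "Turbidity",
                   "External Temperature", "TDS", "pH", "Water Temperature"]


def clean_rows(rows):
    """Drop incomplete rows, strip stray characters, and convert types."""
    cleaned = []
    for row in rows:
        # Drop rows with a missing value in any required column.
        if any(row.get(col) in (None, "") for col in NUMERIC_COLUMNS + ["Mortality"]):
            continue
        try:
            # strip() removes trailing "\r\n" and surrounding whitespace.
            record = {col: float(str(row[col]).strip()) for col in NUMERIC_COLUMNS}
        except ValueError:
            continue  # non-numeric debris remained; treat the row as unusable
        record["Mortality"] = 1 if str(row["Mortality"]).strip() == "Yes" else 0
        cleaned.append(record)
    return cleaned


def standardize(records):
    """Scale each numeric column to zero mean and unit variance (assumed z-score)."""
    for col in NUMERIC_COLUMNS:
        values = [r[col] for r in records]
        mu, sigma = mean(values), pstdev(values)
        for r in records:
            r[col] = (r[col] - mu) / sigma if sigma else 0.0
    return records
```

In practice the scaling statistics would be computed on the training split only and then applied to the test split, so that no information from the test data leaks into training.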


Figure 7 Shows the complete steps that are included in ‘‘Yes’’ is converted to 1, and the ‘‘No’’ value is converted to 0.
preprocessing. Import Libraries: The library in any language Then Mortality column is converted to int data type. Scaling
contains the functions and methods that support different is the next step that is performed next. Table 3 shows the
operations in that language. The first step of preprocessing sample values of the dataset after preprocessing. ADASYN:
is importing libraries that are necessary to perform necessary ADASYN stands for Adaptive Synthetic Sampling, that is the
procedures related to preprocessing of the dataset. Without modification of SMOTE for dealing and adjusting the weights
importing libraries, we can’t complete the task whose for minority class [37]. It is beneficial as it deals with the
definition is defined in the specific library. Read Dataset: The minority class as equity with major class. This helps in better
next step is to load the dataset upon which experiments would training the proposed model and facilitates in obtaining the
be performed. Pandas’ library supports loading the dataset desired results.
on which investigations would be completed. Checking for
Missing Values: The library in any language contains the 1) MACHINE LEARNING ALGORITHMS
functions and methods that support different operations in Different numbers of machine learning algorithms are used
that language. The first step of preprocessing is importing for classification and clustering purposes. This study uses
libraries that are necessary to perform necessary procedures decision trees, random forests, support vector machines,
related to preprocessing of the dataset. Without importing logistic regression, Gaussian Naïve Bayes and XGBoost.
libraries, we can’t complete the task whose definition is Different train-test splits of the dataset are used to train
defined in the specific library. Checking of Categorical tmachine learning models [38]. Description of machine
Data: The next step is checking the categorical column learning classifiers along with the calculation formula is
whether they are present in the dataset. If some flat column given below.
is current, then encoding is performed. All the columns
in the dataset are numeric are contains values that are a: DECISION TREE
not categorized except mortality, which is organized and A decision tree is a famous machine-learning algorithm
needs to be determined based on the importance of the that makes decisions based on a group of features in a
other parameters. Standardize the Data: Standardizing the
hierarchy [39], [40]. It undergoes a recursive process of
data is the next step in preprocessing, in which scaling is
splitting the data into different subsets and creating a tree
performed so that all dataset features are well-tuned and the accuracy of the machine learning algorithms is not affected, because different columns contain different ranges of values and need to be standardized before splitting the data. In the case of the dataset used for this research, all columns other than "created_at," "entry_id," and "Mortality" are numeric, and scaling is applied to these columns in preprocessing.

Data Splitting: Splitting the dataset is significant because one portion is needed to train the machine learning algorithms, and the testing portion is then used to evaluate the results. The dataset under observation is split in ratios of 60%-40%, 70%-30%, 80%-20%, and 90%-10% to test the machine learning algorithms [36].

Implementation of Preprocessing: All the experiments on the dataset are performed on a Core i7 10th-generation laptop with Windows 10 as the operating system. Jupyter Notebook and Python are used as the IDE and language for the experiments.

Table 2 shows sample values of the dataset. Some values are missing in columns due to the calibration time of the sensors, and the Water Temperature column contains "\r\n" and some special characters that need to be cleaned. So, first, null values are dropped from the dataset. Then each column is checked thoroughly, and any special characters found are removed. The Water Temperature column receives particular attention because it contains "\r\n" in each cell, which needs to be removed. After cleaning the data, the columns CO, Ammonia, Humidity, Turbidity, External Temperature, TDS, pH, and Water Temperature are converted to the float data type. The mortality column contains values of "Yes" and "No." The

model. At each node of the decision tree, a decision is made based on a specific feature and value, which builds up the model of the tree. The decision determines which branch of the tree is followed. The process continues under specific criteria until the maximum depth is reached. A decision tree is a simple structure that can be understood and visualized easily. Decision trees can handle both numerical and categorical values and are robust against outliers. Missing values can be addressed with appropriate strategies such as imputation. When the tree becomes deep, it can overfit. To address this issue, random forests combine multiple trees to reduce overfitting and improve performance.

b: RANDOM FOREST
Random Forest is an ensemble technique that combines many decision trees [41]. Different bootstrap samples are used to train the individual trees. A bootstrap sample is obtained by resampling the training dataset with replacement so that the sample size is the same as the training dataset size. The Random Forest prediction is given by Equation 1 [42]:

$$P = \operatorname{majority}\big(T_1(y), T_2(y), T_3(y), \ldots, T_n(y)\big) \tag{1}$$

where,
• $P$ is the prediction obtained by majority vote over the decision trees.
• $T_1(y), T_2(y), T_3(y), \ldots, T_n(y)$ are the outputs of the $n$ decision trees participating in the prediction process.
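As a minimal sketch of the majority vote in Equation 1, the snippet below counts the labels returned by the individual trees and picks the most frequent one. The function name `majority_vote` and the five example votes are illustrative assumptions, not taken from the authors' implementation.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most frequent label among the individual tree outputs."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical outputs T_1(y), ..., T_5(y) of five trees for one sample y
votes = ["No", "Yes", "No", "No", "Yes"]
prediction = majority_vote(votes)
print(prediction)
```

With an even number of trees a tie is possible; in that case `Counter.most_common` returns the label that was counted first, so an odd number of trees is a common way to avoid ambiguity in binary classification.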


FIGURE 7. Preprocessing Steps.

TABLE 2. Sample Values of dataset before preprocessing.

TABLE 3. Sample Values of dataset after preprocessing.
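The cleaning steps described for the raw data (dropping records with missing readings, stripping "\r\n", and casting the sensor columns to float) could be sketched as follows. This is an illustration using only the Python standard library; the function name `clean_record`, the sample values, and the encoding of "Yes"/"No" as 1/0 are assumptions, not the authors' code.

```python
NUMERIC_COLUMNS = ["CO", "Ammonia", "Humidity", "Turbidity",
                   "External Temperature", "TDS", "pH", "Water Temperature"]


def clean_record(record):
    """Return a cleaned copy of one sensor record, or None if a reading is missing."""
    cleaned = {}
    for column in NUMERIC_COLUMNS:
        raw = record.get(column)
        if raw is None or str(raw).strip() == "":
            return None  # drop records with null readings
        # remove the "\r\n" residue, then cast the reading to float
        cleaned[column] = float(str(raw).replace("\r\n", "").strip())
    # assumed label encoding: "Yes" -> 1, "No" -> 0
    cleaned["Mortality"] = 1 if record["Mortality"].strip() == "Yes" else 0
    return cleaned


# Hypothetical raw records; all values are invented for illustration
base = {"CO": "1.2", "Ammonia": "0.4", "Humidity": "55", "Turbidity": "3.1",
        "External Temperature": "31.0", "TDS": "310", "pH": "7.2",
        "Water Temperature": "27.5\r\n", "Mortality": "No"}
raw_records = [
    base,
    {**base, "pH": None},  # missing reading, will be dropped
    {**base, "pH": "5.9", "Water Temperature": "30.1\r\n", "Mortality": "Yes"},
]
cleaned_records = [r for r in map(clean_record, raw_records) if r is not None]
print(len(cleaned_records), "records kept")
```

Scaling of the numeric columns would follow this step; it is left out here because the text does not state which scaler was used.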

c: SUPPORT VECTOR MACHINE (SVM)
SVM (Support Vector Machine) is a well-known classification and pattern recognition technique [43]. It handles high-dimensional datasets by computing a hyperplane that maximizes the separation margin between classes, thus minimizing the classification error. However, when faced with overlapping data, SVM's classification performance is strongly affected. SVM is a supervised machine learning model designed to solve binary classification problems. In most situations, SVM proves effective and produces high accuracy.

d: LOGISTIC REGRESSION
Logistic Regression works with several independent variables to produce discrete outcomes. It calculates the probability of each class present in the dataset, which makes it a good classifier for categorical data. It models the association between the dependent and independent variables. Equation 2 gives the Logistic Regression formula [44]:

$$P(Y = 1) = \frac{1}{1 + e^{-m(x - v_0)}} \tag{2}$$

where,
• $e$ represents Euler's number.
• $v_0$ represents the x-value of the sigmoid midpoint.
• $L$ represents the maximum value of the curve (equal to 1, the numerator, in Equation 2).
• $m$ represents the curve's steepness.

e: GAUSSIAN NAÏVE BAYES
Gaussian Naïve Bayes is based on Bayes' theorem and differs from other algorithms in that all the features in this classifier are assumed to be independent [45]. It is used to classify objects whose data are normally distributed, which is why it is known as Gaussian Naïve Bayes. Equations 3 and 4 show the formulas used by Gaussian Naïve Bayes [46]:

$$P(c \mid x) = \frac{P(c) \cdot P(x \mid c)}{P(x)} \tag{3}$$

$$P(c \mid x) \propto P(c) \cdot P(x \mid c) \tag{4}$$

where,
• $P(c \mid x)$ represents the posterior probability of the target class given the predictor.
• $P(c)$ represents the class's prior probability.
• $P(x \mid c)$ represents the likelihood, i.e., the probability of the predictor given the class.
• $P(x)$ represents the predictor's prior probability.

f: XGBOOST
XGBoost, or Extreme Gradient Boosting, is an efficient gradient boosting algorithm used in machine learning [47]. It can handle complex, high-dimensional
datasets and perform classification and regression. XGBoost combines multiple weak models in an iterative process in which each model corrects the mistakes made by its predecessor. This improves the accuracy of the proposed model and makes it more robust than the others. The main strength of XGBoost is its handling of numerical and categorical features and of missing values in the dataset. It also uses regularization to avoid overfitting during training. It can handle large datasets and trains and predicts quickly.

F. EVALUATION PARAMETERS
Testing data is used to check whether a model is correctly trained. Many evaluation parameters are available for this purpose. We evaluated precision, recall, F1 score, and accuracy for the various machine learning algorithms to select the one most suitable for predicting fish mortality.

1) PRECISION
Precision [48] measures the correctness of a model's positive predictions. It is calculated as follows [49]:

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \tag{5}$$

2) RECALL
Recall [50] deals with the completeness of the classifier. It is calculated as the number of true positives divided by the sum of the true positives and the false negatives [49]:

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{6}$$

3) F1 SCORE
The F1 score is another evaluation parameter, calculated from recall and precision as their harmonic mean. Its values range from 0 to 1. The formula is given below [51]:

$$\text{F1 Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$

4) ACCURACY
Accuracy measures the overall correctness of the classifier. Its values range from 0 to 1. Equation 8 gives the accuracy formula for binary classification [52]:

$$\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Number of Samples}} \tag{8}$$

where,
• True Positive = The model predicted mortality and the real value also shows mortality.
• True Negative = The model predicted no mortality and the real value also shows no mortality.
• False Positive = The model predicted mortality but the real value shows no mortality.
• False Negative = The model predicted no mortality but the real value shows mortality.

IV. RESULTS AND DISCUSSION
This section presents the results of the experiments and of the evaluation parameters. IoT sensors, along with the microcontroller boards, are installed on the Biofloc water tank. The ThingSpeak cloud server stores the data sent from the sensors via the internet. The dataset is imbalanced; therefore, a sampling technique named ADASYN is used to balance the classes. The machine learning algorithms random forest, decision tree, Support Vector Machine (SVM), logistic regression, XGBoost, and Gaussian Naïve Bayes are trained, and ensemble learning over all the above-mentioned algorithms is also performed. The evaluation parameters precision, recall, F1 score, and accuracy are used to evaluate the performance of the machine learning models. The models are trained and tested with 60-40, 70-30, 80-20, and 90-10 train-test splits of the dataset to check their performance at different proportions of training data. Table 4 shows the results of the machine learning algorithms on the dataset's 60-40 train-test split. Random forest, decision tree, and XGBoost offer a better accuracy of 97% compared to the others. The support vector machine follows with a marginally lower accuracy of 94%. Gaussian naïve Bayes offers lower accuracy, precision, recall, and F1 score values than the other machine learning classifiers [53].

TABLE 4. Results of machine learning classifiers using 60–40 train-test split of dataset.

Table 5 illustrates the results of the machine learning classifiers on a 70-30 train-test split of the dataset. Random forest shows the highest accuracy, 98%, among all the machine learning classifiers, and also offers better precision, recall, and F1 score than the others. XGBoost ranks second in accuracy among the machine learning classifiers. Support vector machine and logistic regression show marginally lower accuracy than XGBoost. Gaussian naïve Bayes shows the
weakest results in accuracy, precision, recall, and F1 score among all the classifiers.

TABLE 5. Results of machine learning classifiers using 70–30 train-test split of dataset.

FIGURE 8. Correlation matrix between water quality parameters and mortality.

Table 6 shows the evaluation results of the machine learning classifiers on the dataset's 80-20 train-test split. Random forest and XGBoost outperformed the others with 97% accuracy. Marginally lower accuracy is observed for the decision tree and support vector machine classifiers. Logistic regression ranks second to last among all the machine learning classifiers in accuracy. Gaussian naïve Bayes shows the lowest accuracy, 90%, for the dataset's 80-20 train-test split.

TABLE 6. Results of machine learning classifiers using 80–20 train-test split of dataset.
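To make the evaluation parameters reported in the result tables concrete, the following sketch computes precision, recall, F1 score, and accuracy (Equations 5 to 8) from the four confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, F1 score, and accuracy from confusion counts.

    Assumes the denominators are non-zero, i.e. at least one positive
    prediction and at least one actual positive sample exist.
    """
    precision = tp / (tp + fp)                        # Equation 5
    recall = tp / (tp + fn)                           # Equation 6
    f1 = 2 * precision * recall / (precision + recall)  # Equation 7
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Equation 8
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical counts, invented for illustration only
metrics = classification_metrics(tp=40, tn=50, fp=10, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Here "positive" means that mortality was predicted, matching the definitions of true and false positives given with Equation 8.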


Table 7 shows the results of the machine learning classifiers for the dataset's 90-10 train-test split. Random forest and XGBoost offer the highest accuracy, 97%, among all the machine learning classifiers. The decision tree classifier shows marginally lower accuracy than random forest and XGBoost. Gaussian naïve Bayes offers 89% accuracy, the minimum among all the machine learning classifiers.

TABLE 7. Results of machine learning classifiers using 90–10 train-test split of dataset.
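The four train-test proportions compared in Tables 4 to 7 can be produced with a simple shuffle-and-cut routine. The sketch below uses only the standard library; the function name `split_dataset`, the fixed seed, and the stand-in data are assumptions for illustration, and the authors' experiments may well have used a library routine instead.

```python
import random


def split_dataset(samples, train_fraction, seed=0):
    """Shuffle a copy of the samples and cut it at the requested train fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


samples = list(range(200))  # stand-in for the preprocessed sensor records
for train_fraction in (0.6, 0.7, 0.8, 0.9):
    train, test = split_dataset(samples, train_fraction)
    print(f"{train_fraction:.0%} train: {len(train)} train / {len(test)} test")
```

Each classifier would then be fitted on `train` and scored on `test` for every proportion, which is what allows accuracy to be compared across splits.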

Random forest and XGBoost show better accuracy than the other classifiers in all four train-test splits of the dataset. The train-test splits usually considered good are 70-30 and 80-20, because both contain a significant amount of training data to train a model and leave enough data to test its accuracy.

Figure 9 shows the accuracy of the machine learning classifiers using different train-test splits of the dataset. XGBoost offers a consistent accuracy of 97% in all the train-test splits of the dataset. The random forest also shows high accuracy and remains equal to XGBoost. The decision tree shows marginally lower accuracy than random forest and XGBoost.

FIGURE 9. Accuracy of machine learning classifiers on different train-test splits of the dataset.

Gaussian Naïve Bayes offers the lowest accuracy in all train-test splits of the dataset. Evaluating which water quality parameters affect fish mortality is also essential. Because there are various water quality parameters, some are critical while others have too little influence to require attention. The correlation matrix is useful because it clearly shows the correlation between the different parameters and gives a better idea of the dependent variables. Figure 8 shows the correlation [54] between the water quality parameters and the mortality of the fish. Carbon monoxide has a strong association with the ammonia level of the water and is correlated with the TDS of the water as well. Ammonia is strongly associated with CO, TDS, pH, and water temperature, and it also impacts fish mortality. Humidity in the environment has a loose relationship with external temperature. Turbidity of the water relates to water temperature and has a weak association with pH, TDS, and mortality. TDS is strongly associated with ammonia, pH, mortality, and CO. The pH of the water is highly correlated with TDS and mortality. The mortality of the fish depends strongly on the values of TDS and pH and has a slightly weaker association with the ammonia level of the water. Furthermore, mortality is weakly associated with the CO, turbidity, and water temperature of the Biofloc water tank. A new fish farmer can adopt the proposed IoT-based solution, which focuses on the ammonia level, TDS, and pH, considered the most important factors in fish growth. The sensors installed on the Biofloc water tank take readings continuously. The system can help a fish farmer by predicting, 1-2 hours in advance, poor water conditions that can lead to the mortality of the fish.

V. CONCLUSION AND FUTURE WORK
Biofloc with tank culture is an attractive solution to increase fish production in a limited space. Recycling and reusing water in Biofloc decreases environmental impact and water consumption. Despite these benefits, its sophisticated operation makes Biofloc very sensitive to changes in water quality. A slight change in water quality may lead to fish mortality on a large scale. Moreover, the relevant parameters depend on the type of fish in the water tank. Therefore, it is important to design an IoT solution, together with a machine learning algorithm, for the specific fish to warn of alarming situations. In this article, we have designed and implemented an IoT system for effective continuous monitoring of Biofloc. Using the system, data is collected for Tilapia fish in southern Punjab, Pakistan. Multiple machine learning algorithms, namely decision tree, random forest, support vector machine, logistic regression, Gaussian Naïve Bayes, XGBoost, and ensemble learning, are applied. The evaluation parameters precision, recall, F1 score, and accuracy are used to assess the mortality predictions. Among all the models, random forest and XGBoost show better accuracy of up to 98%. The significance of this accuracy is that the system is 98 percent accurate in identifying the conditions of fish mortality, which is effective in building an early warning system for fish farmers before fish mortality occurs. In the future, we intend to use our system to collect data for different species of fish and train models for their efficient monitoring.


We also intend to evaluate ensemble learning to train the model and obtain better results.

REFERENCES
[1] D. J. McClements and L. Grossmann, "The science of plant-based foods: Constructing next-generation meat, fish, milk, and egg analogs," Comprehensive Rev. Food Sci. Food Saf., vol. 20, no. 4, pp. 4049–4100, Jul. 2021.
[2] Y. I. Chu, C. M. Wang, J. C. Park, and P. F. Lader, "Review of cage and containment tank designs for offshore fish farming," Aquaculture, vol. 519, Mar. 2020, Art. no. 734928.
[3] M. H. Khanjani, M. Sharifinia, and S. Hajirezaee, "Recent progress towards the application of biofloc technology for tilapia farming," Aquaculture, vol. 552, Apr. 2022, Art. no. 738021.
[4] M. H. Khanjani and M. Sharifinia, "Biofloc technology as a promising tool to improve aquaculture production," Rev. Aquaculture, vol. 12, no. 3, pp. 1836–1850, Aug. 2020.
[5] A. Zabidi, F. M. Yusoff, N. Amin, N. J. M. Yaminudin, P. Puvanasundram, and M. M. A. Karim, "Effects of probiotics on growth, survival, water quality and disease resistance of red hybrid tilapia (Oreochromis spp.) fingerlings in a biofloc system," Animals, vol. 11, no. 12, p. 3514, Dec. 2021.
[6] M. B. Ahammed, S. Sultana, A. Sarkar, and A. Momin, "pH and temperature monitoring with a GSM-based auto feeding system of a biofloc technology," Int. J. Sci. Eng. Res., vol. 13, no. 4, pp. 270–274, 2022. [Online]. Available: http://www.ijser.org
[7] E. O. Ogello, N. O. Outa, K. O. Obiero, D. N. Kyule, and J. M. Munguti, "The prospects of biofloc technology (BFT) for sustainable aquaculture development," Sci. Afr., vol. 14, Nov. 2021, Art. no. e01053.
[8] R. P. Singh, M. Javaid, A. Haleem, and R. Suman, "Internet of Things (IoT) applications to fight against COVID-19 pandemic," Diabetes Metabolic Syndrome, Clin. Res. Rev., vol. 14, no. 4, pp. 521–524, Jul. 2020.
[9] A. Khanna and S. Kaur, "Internet of Things (IoT), applications and challenges: A comprehensive review," Wireless Pers. Commun., vol. 114, no. 2, pp. 1687–1762, Sep. 2020.
[10] S. Y. Y. Tun, S. Madanian, and F. Mirza, "Internet of Things (IoT) applications for elderly care: A reflective review," Aging Clin. Experim. Res., vol. 33, no. 4, pp. 855–867, Apr. 2021.
[11] H. Landaluce, L. Arjona, A. Perallos, F. Falcone, I. Angulo, and F. Muralter, "A review of IoT sensing applications and challenges using RFID and wireless sensor networks," Sensors, vol. 20, no. 9, p. 2495, Apr. 2020.
[12] A. Badshah, A. Ghani, A. Daud, A. Jalal, M. Bilal, and J. Crowcroft, "Towards smart education through Internet of Things: A survey," ACM Comput. Surv., vol. 56, no. 2, pp. 1–33, Feb. 2024.
[13] N. Goswami, S. A. Sufian, M. S. Khandakar, K. Z. H. Shihab, and M. S. R. Zishan, "Design and development of smart system for biofloc fish farming in Bangladesh," in Proc. 7th Int. Conf. Commun. Electron. Syst. (ICCES), Coimbatore, India, Jun. 2022, pp. 1424–1432.
[14] N. Rosaline and S. Sathyalakshimi, "IoT based aquaculture monitoring and control system," J. Phys., Conf. Ser., vol. 1362, Nov. 2019, Art. no. 012071.
[15] K. K. Saha, A. Islam, S. S. Joy, I. Writwik, and K. Shikder, "Bio-floc monitoring and automatic controlling system using IoT," in Proc. IEEE Int. Conf. Internet Things Intell. Syst. (IoTaIS), Nov. 2021, pp. 15–21.
[16] M. M. Islam, J. Uddin, M. A. Kashem, F. Rabbi, and M. W. Hasnat, "Design and implementation of an IoT system for predicting Aqua fisheries using Arduino and KNN," in Proc. Int. Conf. Intell. Hum. Comput. Interact., vol. 12616, 2021, pp. 108–118.
[17] B. Mahmuda, E. Haque, A. Al Noman, and F. Ahmed, "Image processing based water quality monitoring system for biofloc fish farming," in Proc. Emerg. Technol. Comput., Commun. Electron. (ETCCE), Dec. 2021, pp. 1–6.
[18] S. A. Mozumder and S. Sagar, "Smart IoT-biofloc water management system using decision regression tree," 2021, arXiv:2112.02577.
[19] M. M. Rashid, A.-A. Nayan, S. A. Simi, J. Saha, M. O. Rahman, and M. G. Kibria, "IoT based smart water quality prediction for biofloc aquaculture," Int. J. Adv. Comput. Sci. Appl., vol. 12, no. 6, pp. 470–477, 2021. [Online]. Available: www.ijacsa.thesai.org
[20] E. Blancaflor and M. Baccay, "Design of a solar powered IoT (Internet of Things) remote water quality management system for a biofloc aquaculture technology," in Proc. 3rd Blockchain Internet Things Conf., Jul. 2021, pp. 24–31.
[21] E. B. Blancaflor and M. Baccay, "Assessment of an automated IoT-biofloc water quality management system in the litopenaeus vannamei's mortality and growth rate," Automatika, vol. 63, no. 2, pp. 259–274, Apr. 2022.
[22] I. Ahamed and A. Ahmed, "Design of smart biofloc for real-time water quality management system," in Proc. 2nd Int. Conf. Robot., Electr. Signal Process. Techn. (ICREST), Jan. 2021, pp. 298–302.
[23] J. A. Prakosa, "Development of monitoring techniques and validation of the acidity level of biofloc pond water for optimizing tilapia aquaculture," IOP Conf. Ser., Earth Environ. Sci., vol. 1017, no. 1, Apr. 2022, Art. no. 012006.
[24] E. Blancaflor and M. Baccay, "Economic & operational impact analysis of a solar powered remote water quality management system designed for an indoor biofloc aquaculture setup," in Proc. 5th Int. Conf. E-Soc., E-Educ. E-Technol., Aug. 2021, pp. 81–86.
[25] W.-S. Kim, W.-S. Lee, and Y.-J. Kim, "A review of the applications of the Internet of Things (IoT) for agricultural automation," J. Biosyst. Eng., vol. 45, no. 4, pp. 385–400, Dec. 2020.
[26] K. B. K. Sai, S. Mukherjee, and H. P. Sultana, "Low cost IoT based air quality monitoring setup using Arduino and MQ series sensors with dataset analysis," Proc. Comput. Sci., vol. 165, pp. 322–327, Jan. 2019.
[27] H. Agrawal, R. Dhall, K. S. S. Iyer, and V. Chetlapalli, "An improved energy efficient system for IoT enabled precision agriculture," J. Ambient Intell. Humanized Comput., vol. 11, no. 6, pp. 2337–2348, Jun. 2020.
[28] D. R. Prapti, A. R. M. Shariff, H. C. Man, N. M. Ramli, T. Perumal, and M. Shariff, "Internet of Things (IoT)-based aquaculture: An overview of IoT application on water quality monitoring," Rev. Aquaculture, vol. 14, no. 2, pp. 979–992, Mar. 2022.
[29] K. B. K. Sai, S. R. Subbareddy, and A. K. Luhach, "IoT based air quality monitoring system using MQ135 and MQ7 with machine learning analysis," Scalable Comput., Pract. Exper., vol. 20, no. 4, pp. 599–606, Dec. 2019.
[30] W. Indrasari, E. Budi, Umiatin, S. Rizqy Alayya, and R. Ramli, "Measurement of water polluted quality based on turbidity, pH, magnetic property, and dissolved solid," J. Phys., Conf. Ser., vol. 1317, no. 1, Oct. 2019, Art. no. 012060.
[31] S. Kunal, A. Saha, and R. Amin, "An overview of cloud-fog computing: Architectures, applications with security challenges," Secur. Privacy, vol. 2, no. 4, p. e72, Jul. 2019.
[32] F. Khan, M. A. B. Siddiqui, A. U. Rehman, J. Khan, M. T. S. A. Asad, and A. Asad, "IoT based power monitoring system for smart grid applications," in Proc. Int. Conf. Eng. Emerg. Technol. (ICEET), Feb. 2020, pp. 1–5.
[33] M. S. Novelan and M. Amin, "Monitoring system for temperature and humidity measurements with DHT11 sensor using nodeMCU," Int. J. Innov. Sci. Res. Technol., vol. 5, no. 10, pp. 123–128, 2020.
[34] S. U. N. Goparaju, S. S. S. Vaddhiparthy, C. Pradeep, A. Vattem, and D. Gangadharan, "Design of an IoT system for machine learning calibrated TDS measurement in smart campus," in Proc. IEEE 7th World Forum Internet Things (WF-IoT), Jun. 2021, pp. 877–882.
[35] P. Mishra, A. Biancolillo, J. M. Roger, F. Marini, and D. N. Rutledge, "New data preprocessing trends based on ensemble of multiple preprocessing techniques," TrAC Trends Anal. Chem., vol. 132, Nov. 2020, Art. no. 116045.
[36] V. R. Joseph, "Optimal ratio for data splitting," Stat. Anal. Data Mining, ASA Data Sci. J., vol. 15, no. 4, pp. 531–538, Aug. 2022.
[37] C.-C. Chang, Y.-Z. Li, H.-C. Wu, and M.-H. Tseng, "Melanoma detection using XGB classifier combined with feature extraction and K-Means SMOTE techniques," Diagnostics, vol. 12, no. 7, p. 1747, Jul. 2022.
[38] M. A. Abid, M. F. Mushtaq, U. Akram, M. A. Abbasi, and F. Rustam, "Comparative analysis of TF-IDF and loglikelihood method for keywords extraction of Twitter data," Mehran Univ. Res. J. Eng. Technol., vol. 42, no. 1, p. 88, Jan. 2023.
[39] A. J. Myles, R. N. Feudale, Y. Liu, N. A. Woody, and S. D. Brown, "An introduction to decision tree modeling," J. Chemometrics, vol. 18, no. 6, pp. 275–285, Jun. 2004.
[40] Y. Y. Song and Y. Lu, "Decision tree methods: Applications for classification and prediction," Shanghai Arch. Psychiatry, vol. 27, no. 2, p. 130, Apr. 2015.
[41] M. Belgiu and L. Drăguţ, "Random forest in remote sensing: A review of applications and future directions," ISPRS J. Photogramm. Remote Sens., vol. 114, pp. 24–31, Apr. 2016.
[42] T. Kam Ho, "Random decision forests," in Proc. 3rd Int. Conf. Document Anal. Recognit., Jun. 1995, pp. 278–282.


[43] W. S. Noble, "What is a support vector machine?" Nature Biotechnol., vol. 24, no. 12, pp. 1565–1567, Dec. 2006.
[44] M. P. LaValley, "Logistic regression," Circulation, vol. 117, no. 18, pp. 2395–2399, 2008.
[45] S. Jayachitra and A. Prasanth, "Multi-feature analysis for automated brain stroke classification using weighted Gaussian Naïve Bayes classifier," J. Circuits, Syst. Comput., vol. 30, no. 10, Aug. 2021, Art. no. 2150178.
[46] A. H. Jahromi and M. Taheri, "A non-parametric mixture of Gaussian naive Bayes classifiers based on local independent features," in Proc. Artif. Intell. Signal Process. Conf. (AISP), Oct. 2017, pp. 209–212.
[47] O. Sagi and L. Rokach, "Approximating XGBoost with an interpretable decision tree," Inf. Sci., vol. 572, pp. 522–542, Sep. 2021.
[48] S. Anantharaj, S. R. Ede, K. Karthick, S. Sam Sankar, K. Sangeetha, P. E. Karthik, and S. Kundu, "Precision and correctness in the evaluation of electrocatalytic water splitting: Revisiting activity parameters with a critical assessment," Energy Environ. Sci., vol. 11, no. 4, pp. 744–771, 2018.
[49] D. M. W. Powers, "Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation," 2020, arXiv:2010.16061.
[50] S. A. Khan and Z. Ali Rana, "Evaluating performance of software defect prediction models using area under precision-recall curve (AUC-PR)," in Proc. 2nd Int. Conf. Advancements Comput. Sci. (ICACS), Feb. 2019, pp. 1–6.
[51] C. Goutte and E. Gaussier, "A probabilistic interpretation of precision, recall and f-score, with implications for evaluation," in Proc. Eur. Conf. Inf. Retr. Berlin, Germany: Springer, Mar. 2005, pp. 345–359.
[52] J. Wahl, J. Freyss, M. von Korff, and T. Sander, "Accuracy evaluation and addition of improved dihedral parameters for the MMFF94s," J. Cheminformatics, vol. 11, no. 1, pp. 1–10, Dec. 2019.
[53] J. Heckman, J. L. Tobias, and E. Vytlacil, "Four parameters of interest in the evaluation of social programs," Southern Econ. J., vol. 68, no. 2, p. 210, Oct. 2001.
[54] I. Archakov and P. R. Hansen, "A new parametrization of correlation matrices," Econometrica, vol. 89, no. 4, pp. 1699–1715, 2021.

MUHAMMAD ADEEL ABID received the M.C.S. and M.S.C.S. degrees from The Islamia University of Bahawalpur, Bahawalpur, in 2008 and 2015, respectively. He is currently pursuing the Ph.D. degree in computer science with the Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan. He has been a Lecturer in computer science with the Khwaja Fareed University of Engineering and Information Technology, since April 2018, accumulating nearly 16 years of teaching experience in various educational institutions. His research interests include predictive analysis on numeric and textual data, pattern identification, and the development of data clusters based on dataset characteristics. Additionally, he specializes in IoT-based solutions addressing real-time problems. He received the Gold Medal during the M.C.S. studies and actively takes on various responsibilities assigned by the department and the university.

MADIHA AMJAD received the M.Sc. degree in electronics from Quaid-i-Azam University, Pakistan, in 2008, the M.S. degree in computer engineering from the University of Engineering and Technology, Pakistan, in 2011, and the Ph.D. degree from the National University of Sciences and Technology (NUST), Pakistan, in 2020. She is currently an Assistant Professor with the Khwaja Fareed University of Engineering and Information Technology (KFUEIT), Rahim Yar Khan. Her research interests include the design and optimization MAC layer schemes for hybrid VLC/RF networks, WSNs and molecular nanonetworks, resource optimization, clustering in unmanned vehicular area networks (UAV), and design of IoT-based solutions to solve indigenous problems.

KASHIF MUNIR received the B.Sc. degree in mathematics and physics from The Islamia University of Bahawalpur, Pakistan, in 1999, the M.Sc. degree in information technology from Universiti Sains Malaysia, in 2001, the M.S. degree in software engineering from the University of Malaya, Malaysia, in 2005, and the Ph.D. degree in informatics from the Malaysia University of Science and Technology, Malaysia, in 2015. He has been in the field of higher education, since 2002. After an initial teaching experience in courses with the Binary College, Malaysia, for one semester and with the Stamford College, Malaysia, for around four years, he later relocated to Saudi Arabia. He was with the King Fahd University of Petroleum and Minerals, Saudi Arabia, from September 2006 to December 2014. Then, he moved to the University of Hafr Al-Batin, Saudi Arabia, in January 2015. In July 2021, he joined the Khwaja Fareed University of Engineering and IT, Rahim Yar Khan, where he is currently an Assistant Professor with the IT Department. He has published journal articles, conference papers, book, and book chapters. His research interests include cloud computing security, software engineering, and project management. He has been in the technical program committee of many peer-reviewed conferences and journals, where he has reviewed many research papers.

HAFEEZ UR REHMAN SIDDIQUE received the B.Sc. degree in mathematics from Islamia University Bahawalpur (IUB), Pakistan, in 1998, the M.Sc. degree in computer science from Bahauddin Zakariya University (BZU), Multan, Pakistan, in 2000, and the Ph.D. degree in electronic engineering from London South Bank University, in April 2016. He was a Lecturer of computer science in network with the Institute of Computer Education (NICE), from 2001 to 2006. His research interests include biomedical and energy engineering applications, data recognition, image processing, system embedded programming, and machine learning.

ANCA DELIA JURCUT (Member, IEEE) received the B.Sc. degree in computer science and mathematics from the West University of Timisoara, Romania, in 2007, and the Ph.D. degree in security engineering from the University of Limerick (UL), in 2013. She has been an Assistant Professor with the UCD School of Computer Science, since 2015. She was a Postdoctoral Researcher with UL, a member of the Data Communication Security Laboratory, and a Software Engineer with IBM, Dublin, in the area of data security and formal verification. She has recently acted as an Evaluator of H2020 proposals for the cryptography and cybersecurity call. Her Ph.D. study was funded by the Irish Research Council for Science Engineering and Technology (IRCSET). Her research interests include security protocols design and analysis, mathematical modeling, automated techniques for formal verification, cryptography, computer algorithms, security for Internet of Things, and blockchain security. Much of her work has focused on formal verification techniques for security protocols using deductive reasoning methods (modal logics and theorem proving), automation of logics for formal verification, the development of new logic-based techniques and tools for formal verification, the design and analysis of security protocols, and formalization and modeling of design requirements for security protocols.
