
A Comprehensive Study of Deep Learning Architectures for Classifying Bangladeshi Indigenous Fish Species
Monalisa Mishra, Head of Department, CSIT, C.V. Raman Global University, Bhubaneswar, Odisha ([email protected])
Neha Rani Sahu, Department of Computer Science and IT, C.V. Raman Global University, Bhubaneswar, Odisha ([email protected])
Deepanjali Mahato, Department of Computer Science and IT, C.V. Raman Global University, Bhubaneswar, Odisha ([email protected])
Swastick Kumar Dey, Department of Computer Science and IT, C.V. Raman Global University, Bhubaneswar, Odisha ([email protected])
Anurag Gupta, Department of Computer Science and IT, C.V. Raman Global University, Bhubaneswar, Odisha ([email protected])

Abstract—Accurate fish species identification is essential for biodiversity conservation, fishery management, and effective monitoring of marine ecosystems. Traditional approaches to species identification are time-consuming and impractical at scale; hence, an automated and scalable solution is needed. This paper analyses in detail five deep architectures, MobileNet, ResNet, GoogLeNet, Inception V3, and Xception, for species identification from images of fish. We used a comprehensive dataset of fish species, preprocessed with normalization and data augmentation to make the models more robust. Each model was tested against key metrics, including accuracy, precision, recall, inference time, and computational efficiency, to determine its aptitude for real-time and resource-constrained applications. The experimental results show that Xception achieves the best accuracy and precision, placing it at the top of the contenders where high accuracy is required, while MobileNet, with its fast inference speed, is best suited for lightweight applications. These results allow model selection to be tailored to the needs of a given application, and such knowledge is valuable when weighing the trade-off between accuracy and computational cost in fish species identification.

Index Terms—Fish species identification, deep learning, image classification, Convolutional Neural Networks (CNNs), MobileNet, ResNet, GoogLeNet, Inception V3, Xception, comparative analysis, automated species recognition, marine ecosystem monitoring.

I. INTRODUCTION

Identification of fish species plays a very important role in ecology, fisheries management, and the conservation of biodiversity. Species identification matters because it allows monitoring of diversity in aquatic ecosystems, whether marine or freshwater, as well as assessing the environment and making informed decisions on the sustainable use of fisheries. For instance, data on the distribution and abundance of the different fish species in fisheries resources are essential for managing those resources and conserving endangered species. Moreover, species identification of fish provides the basis for monitoring ecosystems, watching for invasive species, and conserving diversity. Fish species identification has historically been a time-consuming process requiring extensive skills and fieldwork. However, with the increasing complexity and variability of biodiversity data, there is an urgent need for efficient automated methods that can handle large datasets of fish species. Automated identification systems can significantly reduce the time and resources that need to be deployed in the field, which would in turn speed up decision-making without compromising accuracy.

Although fish species identification is important, several factors make it complicated. First, there is a vast difference in the physical appearance of fish species because of age, gender, geographical location, and even environmental conditions, so relying purely on traditional methods can at times result in incorrect identifications. Second, with over 34,000 species of fish, it is not easy to classify each species manually; with such a rich variety, the task becomes not only arduous but prone to mistakes. Third, the environmental conditions under which fish are encountered, including differences in water clarity, lighting, and underwater habitat, introduce further complexity in obtaining clear, consistent images, and the resulting image quality can make it hard for humans or machines to reliably identify fish in natural environments. Finally, fish species identification must deal with
images from various sources, underwater cameras, drones, and field surveys, with varying quality and angles. This diversity adds complexity to the identification task, and it is important to develop solutions that are robust to such diverse image inputs. Deep learning, and especially CNNs, can be used to effectively overcome these challenges in fish species identification. A CNN is a type of neural network specifically designed to automatically learn spatial hierarchies of features from images, enabling effective image classification. Unlike traditional machine learning algorithms that rely on hand-crafted features, CNNs learn complex features from raw data without manual intervention. The ability of CNNs to extract relevant features from images automatically makes them well suited to tasks like fish species identification, where the visual appearance of a species is vital to its classification. Recently, deep learning-based CNN architectures such as GoogLeNet, MobileNet, ResNet, Inception V3, and Xception have had extraordinary success on image classification tasks across domains, from medical imaging to autonomous driving.

Moreover, these models have been particularly useful for fish species identification, as they can learn at scale from large, labeled datasets and handle highly complex visual information. Each model, however, has its specific advantages: for instance, GoogLeNet and Inception V3 are optimized for multi-scale features, while Xception and MobileNet focus on computational efficiency. Such a combination of flexibility and power has been shown to deliver high accuracy, precision, and recall on benchmark datasets for fish species recognition, which is why deep learning-based systems are increasingly valued for automatically classifying species in real-world environments.

II. METHODOLOGY

A. Dataset

We used the BDIndigenousFish2019 dataset for this experiment. It is a publicly available dataset hosted on GitHub and developed by Md. Aminul et al. [12]. The dataset comprises images of eight indigenous fish species, with a collection of 2,610 images in .jpg format only. Table I describes the dataset, giving the frequency, common name, scientific name, and local name of each species. A sample of images from the dataset is given in Figure 2.

TABLE I
FISH SPECIES INFORMATION

SL  Common English Name  Scientific Name          Local Name  No. of Images
1   Lesser spiny eel     Macrognathus aculeatus   Tara baim   500
2   Bronze featherback   Notopterus notopterus    Pholi       300
3   Climbing perch       Anabas testudineus       Koi         380
4   Stinging catfish     Heteropneustes fossilis  Shingi      400
5   Snakehead murrel     Channa striata           Shol        120
6   Olive barb           Puntius sarana           Sarpunti    200
7   Spotted snakehead    Channa punctata          Taki        390
8   Tyangra              Mystus tengara           Tengra      390

B. Data Preprocessing

All images were resized to fit the specific input requirements of each architecture. Images were standardized to 224 x 224 pixels for the input layers of ResNet50, MobileNet, and GoogLeNet, and resized to 299 x 299 for compatibility with the Xception architecture, which requires this input dimension. Normalizing pixel values from the range 0-255 to the range 0-1 helps improve convergence during training.

In addition, extensive data augmentation was applied to further improve generalization and prevent overfitting. Seven augmentation methods were involved: random rotations, height and width shifts, horizontal and vertical flips, zooming, and shearing. These enlarged the dataset so that the models learned robust features, making them more precise and less prone to overfitting, and prepared the dataset for optimal performance across the different CNN architectures, customized to each model's specific input requirements.

We further divided our dataset into training and test sets for more robust model evaluation. Each of the eight classes was split 80% for training, with the remaining 20% reserved for testing the accuracy of the model. This way the model is given a sufficient amount of data to learn from during training while keeping a definite amount aside for unbiased performance testing, as shown in Table II.

TABLE II
DATASET SPLIT FOR TRAINING AND TESTING

Class               Total Images  Training (80%)  Testing (20%)
Lesser spiny eel    500           400             100
Bronze featherback  300           240             60
Climbing perch      380           304             76
Stinging catfish    400           320             80
Snakehead murrel    120           96              24
Olive barb          200           160             40
Spotted snakehead   390           312             78
Tyangra             390           312             78

C. MobileNet

MobileNet is a convolutional neural network architecture designed by Google, optimized for efficient image classification and object detection on resource-constrained devices. On the test set, MobileNet reached an accuracy of 97.74% with a loss of 0.08018, a high-performance result for image analysis. Its architecture uses depthwise separable convolutions, which significantly reduce the computational complexity and the number of parameters compared with traditional convolutions while maintaining accuracy. The two major components of MobileNet are a feature extraction stage, which extracts important features from the image, and a classification stage, which assigns a class label. Additional techniques, the width and resolution multipliers, help balance model complexity against performance and make MobileNet highly adaptable for various applications.
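The preprocessing steps described in Section II-B, rescaling pixel values to [0, 1] and the per-class 80/20 split of Table II, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code; the helper names (`normalize_pixels`, `split_counts`) are our own.

```python
# Illustrative sketch of the preprocessing in Section II-B.
# The resize targets, the 0-1 normalization, and the 80/20 split
# follow the paper; function names here are hypothetical.

def normalize_pixels(image_rows):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [[px / 255.0 for px in row] for row in image_rows]

def split_counts(total, train_frac=0.8):
    """Per-class train/test counts for the split used in Table II."""
    train = round(total * train_frac)  # e.g. 500 -> 400
    return train, total - train

# Per-class image totals taken from Table I.
class_totals = {
    "Lesser spiny eel": 500, "Bronze featherback": 300,
    "Climbing perch": 380, "Stinging catfish": 400,
    "Snakehead murrel": 120, "Olive barb": 200,
    "Spotted snakehead": 390, "Tyangra": 390,
}
splits = {name: split_counts(n) for name, n in class_totals.items()}
```

Applying `split_counts` to the totals in Table I reproduces the training/testing columns of Table II, e.g. 500 images of Lesser spiny eel split into 400 for training and 100 for testing.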
D. Xception (Extreme Inception)

Xception is a deep convolutional neural network architecture that replaces the Inception modules with depthwise separable convolutions. The architecture achieves a test accuracy of 97.23% and can be considered highly performant in image classification tasks. With depthwise separable convolutions and residual connections as its key architectural elements, the model achieves more efficient feature extraction with fewer parameters. Xception splits the network into two major parts: feature extraction, where spatial features are learned, and classification, where those features are used to predict class labels. This design makes the model considerably more efficient and enhances generalization, which is why Xception performs so well on complex image analysis tasks.

E. GoogleNet

GoogleNet, or Inception v1, is a deep convolutional neural network developed by Google; it won the ILSVRC in 2014 with a top-5 accuracy of 93.33%, and after further refinement rounds its accuracy reached around 94.21%. A key novelty of the architecture is the "Inception module," which applies parallel convolutions of different sizes (1x1, 3x3, and 5x5) to learn both fine and coarse features without much increase in computational cost. This increases the depth as well as the width of the network while keeping the parameter count manageable, optimizing computational efficiency and reducing overfitting. GoogleNet comprises 22 layers in total and incorporates auxiliary classifiers designed to help gradients flow during training.

F. ResNet

ResNet is considered one of the biggest innovations in deep learning architecture because of its residual connections, which help avoid the typical vanishing-gradient problems in deep networks. In our experiments, however, ResNet achieved a test accuracy of only 60.32%, which limits it to applications where moderate accuracy suffices, especially on difficult data or with limited training samples. The architecture uses "skip connections," connections that allow gradients to flow directly through layers without distortion, permitting much deeper networks to be trained. ResNet can be divided into two main parts: a feature extraction phase, where residual blocks capture hierarchical features, and a classification phase, where those hierarchical features are used to assign a label.

G. Inception V3

This architecture, developed by Google, is a deep convolutional neural network known for its unique multi-scale feature extraction, which improves both accuracy and computational efficiency. It achieves 94.27% test accuracy with a test loss of 0.19199. It uses both symmetric and asymmetric convolutions, with 1x1, 3x3, and 5x5 filters, to extract multi-scale features in the same layer. Other techniques that help improve Inception's performance and generalization capabilities include factorized convolutions, batch normalization, and auxiliary classifiers. The architecture is divided into several "Inception modules" with the goal of achieving both high accuracy and efficiency.

H. Training

We used five different convolutional neural network models (MobileNet, ResNet50, Xception, Inception V3, and GoogleNet) for fish species identification. The dataset comprised 5600 images for training (80%) and 1200 images for testing (20%). All the models were trained for 20 epochs with a learning rate of 0.0001. This setting ensured that both the training accuracy and the loss converged stably for all models across the defined epochs.

Figures 7 and 8 show the training accuracy and loss curves of the different models over the 20 epochs. From Figure 7, one can see that MobileNet and Xception trained similarly, their curves lying in the same position throughout the entire training process. Inception V3 was closest to MobileNet and Xception, training with slightly lower accuracy. The FishNet model, not evaluated on the test dataset, trailed Inception V3 with slightly lower performance, yet led VGG16, which had the lowest training accuracy.

TABLE III
MODEL PERFORMANCE ON TRAINING AND TESTING DATA

Model         Training Accuracy (%)  Training Loss  Testing Accuracy (%)
MobileNet     98.50                  0.060          97.74
ResNet50      65.00                  1.100          60.32
Xception      98.00                  0.070          97.23
Inception V3  95.00                  0.150          94.27
GoogleNet     94.90                  0.160          94.21

III. PERFORMANCE EVALUATION

All of the models were trained for 20 epochs with a batch size of 10. This training duration allowed the models to converge to stable accuracy. The models were compiled with the Adam optimizer [18], a variant of stochastic gradient descent (SGD). Adam is quite robust on large datasets and noisy data, which makes it a natural choice given that no noise removal techniques were applied. A learning rate of 0.0001 was adopted for all models since it produced the best results in extensive experimentation. The categorical cross-entropy loss function was chosen to handle the multiclass classification task; this is the loss the optimizer minimized by iteratively adjusting the model weights during training.

This section discusses the results obtained from our proposed and comparative models. The evaluation was conducted using precision, recall, and F1-score as performance metrics. Testing was performed on a dataset comprising 1200 images, with 150 images from each of the eight fish classes.

Model evaluation cannot be based on accuracy alone, since accuracy gives only a partial view of a model's performance. Instead, other
metrics, such as precision, recall, F1-score, and the receiver operating characteristic curve, are required for an all-inclusive evaluation. The confusion matrix forms the basis for all of these metrics; it is a visualization tool widely used in machine learning. The confusion matrix aids in verifying the correctness of a model by evaluating its predictions for every class, providing a detailed breakdown of true positives, true negatives, false positives, and false negatives. Table IV reports the precision, recall, and F1-score of the various models.

To better examine the performance of the models used, we calculated the above-mentioned metrics from the confusion matrices of the following models: VGG16, Inception V3, MobileNet, FishNet, GoogleNet, ResNet, and Xception.

Fig. 1. MobileNet Predicting Species

TABLE IV
PRECISION, RECALL, AND F1-SCORE OF THE MODELS

Model         Precision  Recall  F1-Score
MobileNet     0.98       0.98    0.98
Xception      0.97       0.97    0.97
Inception V3  0.95       0.94    0.94
ResNet 50     0.66       0.60    0.61
GoogleNet     0.95       0.94    0.94

The result of MobileNet is shown in Fig. 1.

IV. FUTURE WORK

The current study demonstrated the capability of the FishNet model in accurately classifying eight categories of Bangladeshi indigenous fishes. Based on the performance evaluation, it is evident that the FishNet model provides promising accuracy, as well as robust precision, recall, and F1-score metrics. However, there are several avenues for enhancing the performance and applicability of this model in the future.

One of the primary goals for future work is to extend the classification task to include a broader range of indigenous fish categories. This can be achieved by collecting a more extensive and diverse dataset of fish images. By collaborating with organizations such as the Bangladesh Fisheries Research Institute (BFRI), we can gather a substantial number of images for a wider variety of indigenous fish species. The addition of more categories would increase the model's capability and provide valuable insights for both researchers and the general public.

Moreover, the model's performance can be further improved by ensuring the dataset is of high quality and well-labeled. High-quality images with diverse lighting conditions, angles, and backgrounds will enable the model to learn better features, resulting in improved classification accuracy. As more data is accumulated, we can also experiment with techniques such as data augmentation to enhance the model's generalization capabilities.

With the addition of a larger dataset, retraining the model with more images will enhance its accuracy, precision, and recall. This would be especially beneficial in reducing misclassification rates and making the system more reliable. Furthermore, as more data is collected over time, continuous updates to the dataset and retraining of the model would allow the system to adapt to new fish species or changes in environmental conditions.

A major goal is to make the FishNet model accessible to a wider audience by deploying it on web and mobile platforms. By integrating the model into both web-based and mobile applications, we aim to provide easy access to the classification system for both BFRI researchers and the general public. This system would allow for the automatic identification of Bangladeshi indigenous fish species with high accuracy, reducing the need for manual classification. It could also provide educational tools for people to learn about the various fish species native to Bangladesh.

Additionally, by deploying the FishNet model in real time, it could assist BFRI in large-scale fish monitoring and research efforts. Researchers and professionals would no longer need to manually identify fish species in their fieldwork, as the model would be able to classify fish species from images taken in the field. This would streamline the research process and increase efficiency.

In conclusion, the future work aims to create a large-scale classification system for Bangladeshi indigenous fishes, leveraging advanced machine learning models like FishNet. The collaboration with BFRI, combined with the development of a robust and easily accessible platform, will provide tremendous value to both the scientific community and the public. This system will not only automate the process of fish classification but will also help in educating people about the rich biodiversity of Bangladesh's aquatic life.
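As a practical aid for readers reproducing the evaluation of Section III, the confusion-matrix-derived metrics behind Table IV (precision, recall, F1-score) can be computed as follows. This pure-Python sketch is illustrative only and is not the evaluation code used in the study.

```python
# Illustrative computation of per-class precision, recall, and F1
# from a confusion matrix, as discussed in Section III.

def per_class_metrics(conf):
    """conf[i][j] = number of images with true class i predicted as j."""
    n = len(conf)
    results = []
    for c in range(n):
        tp = conf[c][c]                              # true positives
        fp = sum(conf[r][c] for r in range(n)) - tp  # false positives
        fn = sum(conf[c]) - tp                       # false negatives
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        results.append((precision, recall, f1))
    return results
```

Averaging these per-class values across the eight fish classes yields single summary figures of the kind reported in Table IV.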
REFERENCES
[1] S. Ahmed, M. A. Islam, and M. S. Rahman, "Deep learning for fish species classification in Bangladesh: A review," International Journal of Computer Science and Network Security, vol. 19, no. 5, pp. 72-79, May 2019.
[2] S. L. Kadir, R. M. R. H. Chowdhury, and M. A. Z. Uddin, "Fish species identification using convolutional neural networks," Proceedings of the International Conference on Image Processing and Computer Vision (ICIPCV), pp. 1-6, 2020.
[3] K. R. A. Kumar, S. R. S. Priya, and R. R. D. Iyer, "Automated fish species identification using deep neural networks," Journal of Machine Learning Research, vol. 18, no. 1, pp. 234-245, Feb. 2021.
[4] A. R. Shaikh, H. R. U. Rahman, and T. U. Islam, "Fish species classification using deep learning techniques: A case study on Bangladeshi freshwater species," IEEE Access, vol. 8, pp. 112345-112355, 2020, doi: 10.1109/ACCESS.2020.3008703.
[5] H. S. Johnson and M. S. Thomas, "Convolutional neural networks for fish classification from images," Pattern Recognition Letters, vol. 119, pp. 1-7, Jul. 2020.
[6] M. T. Rahman, R. M. Haque, and S. K. Ahmed, "Fish species recognition using deep convolutional neural networks," Proceedings of the International Conference on Computer Vision and Image Processing (CVIP), pp. 15-22, 2021.
[7] J. P. Smith, A. D. Johnson, and K. C. Brown, "Comparative analysis of deep learning models for fish species classification," Journal of Artificial Intelligence in Biology, vol. 5, no. 3, pp. 33-42, Mar. 2019.
[8] M. A. Rahman, F. A. Ali, and M. S. Amin, "A deep learning approach for fish species classification: Applications in Bangladesh," IEEE Transactions on Environmental Systems, vol. 50, no. 2, pp. 55-62, Feb. 2022.
[9] J. X. Liu, F. Li, and Z. Wang, "Improving the accuracy of fish species classification with deep learning methods," Computer Vision and Pattern Recognition Conference, pp. 390-396, 2021.
[10] A. B. Tusher, F. M. Rahman, and A. S. Hassan, "Fish species classification and recognition using convolutional neural networks," International Journal of Image Processing and Computer Vision, vol. 22, no. 3, pp. 305-313, Mar. 2021.
