Article
A Study on Enhancement of Fish Recognition Using
Cumulative Mean of YOLO Network in Underwater
Video Images
Jin-Hyun Park 1 and Changgu Kang 2, *
1 Department of Mechatronics Engineering, Gyeongnam National University of Science and Technology,
Jinju-si 52795, 33 Dongjin-ro, Gyeongsangnam-do, Korea; [email protected]
2 School of Computer Engineering, Gyeongnam National University of Science and Technology,
Jinju-si 52795, 33 Dongjin-ro, Gyeongsangnam-do, Korea
* Correspondence: [email protected]; Tel.: +82-55-751-3321
Received: 15 October 2020; Accepted: 19 November 2020; Published: 22 November 2020
Abstract: In the underwater environment, in order to preserve rare and endangered species or to eliminate exotic invasive species that can destroy ecosystems, it is essential to classify objects and estimate their number, but doing so is very difficult. While YOLO shows excellent performance in object recognition, it recognizes objects by processing the images of each frame independently of one another. By accumulating the object classification results from past frames onto the current frame, we propose a method to accurately classify objects and count their number in sequential video images. The proposed method achieves high classification probabilities of 93.94% and 97.06% in the test videos of Bluegill and Largemouth bass, respectively, and shows very good classification performance in video images taken of the underwater environment.
1. Introduction
Techniques for classifying populations and estimating their numbers in aquatic ecosystems are important and essential for conserving rare and endangered populations and for eliminating exotic species that destroy ecosystems. In general, small populations are counted directly, or estimated using the line-transect method [1,2] or the mark–recapture method [3,4]. For large populations, a camera is generally used, and individuals must be counted directly from camera images or video images [5,6]. With any of these methods, it is very difficult to classify populations and estimate the number of individuals.
Convolutional Neural Networks (CNNs) are widely used for object recognition and classification, and show very good results. Many methods have been proposed based on the principles of the CNN, and their performance has been demonstrated in various fields [7–10]; several studies in aquatic environments have also been conducted using CNNs [11–14]. Deep-learning-based image classification began with AlexNet [8], and further research was carried out in GoogLeNet and VGGNet [15,16]. ResNet, which appeared in 2015, outperformed human judgment [17]. Based on these studies, research has focused not only on the image classification problem, but also on the object detection problem, which classifies the various objects in an image into specific classes and predicts their locations [10]. R-CNN [18,19], which shows good performance in detection problems, creates potential bounding boxes on an image and then runs a classifier on the proposed boxes. After classification, post-processing is used to refine the bounding boxes, eliminate duplicate detections, and rescore the boxes based on the other objects in the scene [20]. In contrast, You Only Look Once (YOLO) [20–22] is the fastest system for detecting and classifying various objects. YOLO has a simple structure with a single convolutional network that simultaneously predicts bounding boxes and classification probabilities. Its much faster operating time allows real-time processing, and it filters out the background by reasoning globally about the image. Furthermore, whereas a general CNN cannot classify multiple objects in one image, the YOLO network can classify multiple objects using bounding boxes. This is useful for recognizing different objects in a single image and counting the number of recognized objects; in particular, it can be used very effectively for classifying populations and estimating the number of individuals in video images.
However, despite the advantages of YOLO, it is difficult to obtain accurate results in every frame of low-illumination or unfocused images, such as video images of an underwater environment [11,23]. To address this problem, a data collection method has been proposed and has become a very useful alternative [11]. That method can improve the classification performance for single images, but it is difficult to use it to classify multiple objects in one image or to count objects in real time. The human visual system classifies objects by continuously looking at them. In contrast, YOLO does not use sequential images; it processes each frame individually. YOLO can process video images in real time, but it classifies objects' locations and classes using only one frame at a time. This means that the classification results of the previous frame do not affect those of the current frame.
Therefore, we propose a method to accurately classify objects and count their number in a video image by accumulating the classification results from past frames onto the current frame. Depending on the underwater environment, the classification performance of some frames may be degraded, but this disadvantage is compensated by applying the cumulative mean. This is a heuristic method that mimics human experience and learning.
In this study, we use YOLOv2 [21] for object recognition in video images taken in the underwater environment, and we apply this human-like heuristic by accumulating the mean of the classification results of past frames to improve object classification and to count the number of objects accurately. We verified that the proposed method improves the classification and counting of objects in video images.
2.1. YOLO
There are many studies on how to apply a CNN to classify objects in unedited real-time video images [22,24–27]. In order to apply a CNN, it is necessary to crop the image to fit the input size of the CNN [24–26]. Recent studies have used a saliency map [26,27] to select the region in which to crop the image. However, when using a saliency map, the processing time and performance vary depending on the number of filters, and processing time is the most important factor in real-time processing. In the case of YOLO, there is no need to crop the input image for object recognition, and its structure and processing time are suitable for real-time processing. YOLO handles bounding boxes and class probabilities at more than 45 fps over the entire image, making it very fast. Furthermore, if there are no objects in the image, or the objects present are not subject to classification, it is less likely to detect a wrong object [20–22]. However, from the viewpoint of real-time processing, it is difficult to derive accurate results in every frame of video images that are unfocused or poorly illuminated. If YOLO output accurate classification results in every frame, the proposed method would not be necessary.
Using anchor boxes to help predict the position and the size of objects in an image in deep learning systems increases the speed and efficiency of object detection. In YOLO networks, anchor boxes are a set of bounding boxes with defined heights and widths. These boxes are defined to capture the scale and aspect ratio of the specific object classes to be detected, and are typically selected according to the size of the objects in the learning data set, as shown in Figure 1. In this study, the average Intersection over Union (IoU) was calculated by k-means clustering of the various bounding box sizes, and k = 4, with an average IoU of more than 0.74, was selected.
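As a concrete illustration of this selection step, the following minimal NumPy sketch clusters labeled box sizes with k-means under an IoU distance; it is not the authors' MATLAB code, and the box sizes shown are hypothetical.

```python
import numpy as np

def iou_wh(boxes, centroids):
    # IoU between (width, height) pairs, with boxes aligned at a common corner.
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=4, iters=100, seed=0):
    # Cluster box sizes with distance = 1 - IoU; return anchors and mean IoU.
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    mean_iou = iou_wh(boxes, centroids)[np.arange(len(boxes)), assign].mean()
    return centroids, mean_iou

# Hypothetical (width, height) sizes of labeled fish boxes, in pixels.
boxes = np.array([[120, 60], [200, 90], [80, 40], [260, 120], [150, 70], [60, 35]])
anchors, mean_iou = kmeans_anchors(boxes, k=4)
print(anchors, mean_iou)   # the paper accepted k = 4 at a mean IoU above 0.74
```

Using 1 − IoU rather than Euclidean distance keeps large and small boxes from being traded off by absolute pixel error.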
YOLO was trained using the YOLOv2 implementation provided by MATLAB. The optimization method for YOLOv2 was Stochastic Gradient Descent with Momentum (SGDM), the initial learning rate was set to 1.0 × 10⁻⁴, and the size of the mini-batch was set to 256. For the hardware devices, a CPU (Intel i9-7900, 3.30 GHz) and four GPUs (NVIDIA GeForce GTX 1080 Ti) were used. Figure 2 shows the learning results of YOLOv2. In the case of catfish, the average precision is 82%. The six species of fish other than catfish were learned with more than 93% precision.
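For reference, the SGDM update behind this training configuration can be written in a few lines; this is a generic NumPy sketch using the paper's learning rate of 1.0 × 10⁻⁴, and the momentum coefficient of 0.9 is an assumption (MATLAB's default), not a value reported here.

```python
import numpy as np

def sgdm_step(w, grad, velocity, lr=1.0e-4, momentum=0.9):
    # Stochastic Gradient Descent with Momentum:
    #   v <- momentum * v - lr * grad ;  w <- w + v
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Toy usage on one weight vector with stand-in mini-batch gradients.
w = np.zeros(3)
v = np.zeros_like(w)
for grad in [np.array([1.0, -2.0, 0.5])] * 3:
    w, v = sgdm_step(w, grad, v)
print(w)
```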
After training YOLO, a heuristic method was applied to classify objects in video images taken in the underwater environment. Figure 3 shows the underwater photography system installed to obtain the test video images. Since there is a great deal of floating matter in the aquatic environment, the video image changes according to changes in sunlight and external light. Our underwater photography system was equipped with wireless communication, and transmitted the classified fish images and classification probabilities. Therefore, we needed a method that can accurately classify fish and count the number of classified fish in video images.
Figure 2. Learning results.
3. Proposed Method
By the central limit theorem, the mean of the sample means converges to the mean of the population, and the standard error of the sample means decreases with the number of samples, as shown in Equation (1) [29,30]:

$s_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ (1)

where $s_{\bar{x}}$ is the standard error of the sample means, $\sigma$ is the standard deviation of the population, and $n$ is the number of samples.
Therefore, the more classified samples of the same object there are in the video images, the higher the confidence level for the classified object. In general, the time during which an object is recognized before disappearing from the video images is not constant, but assuming that at least 1 s is measured, sample means over 30 frames or more can be obtained. When using the sample means of 30 frames or more, the standard error is reduced to $s_{\bar{x}} < \sigma/\sqrt{30} = 0.1825\sigma$.
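The effect of Equation (1) can be checked numerically. The sketch below draws hypothetical per-frame classification probabilities around a true value and confirms that the empirical standard error of 30-frame means matches $\sigma/\sqrt{30}$; all values here are illustrative, not measurements from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n_frames, trials = 0.1, 30, 10_000

# Hypothetical per-frame probabilities for one fish: true value 0.8, noise sigma.
samples = rng.normal(0.8, sigma, size=(trials, n_frames))
sample_means = samples.mean(axis=1)

print(sample_means.std())         # empirical standard error of a 30-frame mean
print(sigma / np.sqrt(n_frames))  # Equation (1): sigma / sqrt(30) = 0.1825 * sigma
```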
By applying the heuristic method to YOLO, our method can maintain higher accuracy than CNN or YOLO classification results, which use one frame for the object classification of video images. This is because our method has a low standard error, which depends on the number of frames, as shown in Equation (1). The mean of the sample means was calculated as the cumulative mean for each object using the classification results of successive images, as shown in Equation (2):

$\mathrm{Avg}_i(k) = \dfrac{i-1}{i}\,\mathrm{Avg}_{i-1}(k) + \dfrac{p_i(k)}{i}$, (2)

where $i$ denotes the number of frames and $k$ denotes a classification object, so $\mathrm{Avg}_i(k)$ is the cumulative mean for $i$ and $k$, and $p_i(k)$ is the probability of classification for $i$ and $k$.
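Equation (2) is the standard incremental form of a running mean, so each tracked object can be updated in constant time per frame. The sketch below is one plausible reading of it, keeping a separate frame count per class k; the class names and probabilities are hypothetical, and a real system would take $p_i(k)$ from the YOLO output for the tracked object.

```python
from collections import defaultdict

class CumulativeMean:
    """Running mean of YOLO class probabilities for one tracked object (Equation (2))."""

    def __init__(self):
        self.avg = defaultdict(float)    # Avg_i(k) for each class k
        self.count = defaultdict(int)    # number of frames i scored for class k

    def update(self, k, p):
        # Avg_i(k) = (i - 1)/i * Avg_{i-1}(k) + p_i(k)/i
        self.count[k] += 1
        i = self.count[k]
        self.avg[k] = (i - 1) / i * self.avg[k] + p / i
        return self.avg[k]

    def best(self):
        return max(self.avg, key=self.avg.get)

# Hypothetical frames: early misclassifications, then consistent correct scores.
cm = CumulativeMean()
for p in (0.55, 0.60):
    cm.update("mandarin_fish", p)
for p in (0.90, 0.88, 0.92, 0.91):
    cm.update("largemouth_bass", p)
print(cm.best(), cm.avg[cm.best()])   # the consistently scored class wins over outliers
```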
Figure 5 shows the capture region; the capture lines are adaptively set according to the size of the detected object, as shown in Equation (3). If the object is large, the bounding box of the recognized object is large and the recognition probability is also high; therefore, the capture region is set narrow, so that the object is classified only when the center of the object is a short distance from the center of the image. If the object is small, the capture region is set wide, so that the object can be classified even when the center of the object is far from the center of the image. When YOLO does not recognize a fish for more than 20 frames after the fish was recognized in the capture region, it is assumed that the fish disappeared in the other direction, and we classify the fish and add it to the count:
$c_l = A/(w_l \times w_h), \quad c_s \le c_l \le c_w$ (3)

where $A$ is any constant, $w_l$ is the width of the bounding box of the object, $w_h$ is the height of the bounding box, $c_l$ is the width of the capture region, $c_s$ is the width of the minimum capture region, and $c_w$ is the width of the maximum capture region.
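A minimal sketch of the adaptive capture test in Equation (3) follows; it interprets $c_l$ as a half-width band around the vertical center line of the image, and the constant $A$, the clamp values $c_s$ and $c_w$, the helper names, and the pixel sizes used below are all hypothetical.

```python
def capture_width(w_l, w_h, A=2.0e6, c_s=40.0, c_w=300.0):
    # Equation (3): c_l = A / (w_l * w_h), clamped so that c_s <= c_l <= c_w.
    return min(max(A / (w_l * w_h), c_s), c_w)

def in_capture_region(box_center_x, image_width, w_l, w_h):
    # Classify the object once its center comes within c_l of the image center.
    return abs(box_center_x - image_width / 2.0) <= capture_width(w_l, w_h)

# A large, close fish must be near the center line; a small, distant fish
# is captured over a much wider band.
print(capture_width(260, 120))   # large box  -> narrow capture region (~64 px)
print(capture_width(60, 35))     # small box  -> wide region (clamped to 300 px)
print(in_capture_region(box_center_x=420, image_width=640, w_l=60, w_h=35))
```

The 20-frame disappearance rule described above would then finalize the cumulative mean for the tracked fish and add it to the count.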
Figure 4. The flow for the cumulative mean of the YOLO network.
Figure 5. Capture lines.
4. Experiments
The Largemouth bass and Bluegill video images taken in the pond were used for the performance evaluation. A general CNN classifies objects one frame at a time, so the classification performance and recognition rate differ depending on the training. In particular, as the video images are taken underwater, they are sensitive to changes in sunlight or external lighting, and thus a secondary method beyond recognizing fish in a single frame is required [20–22]. In addition, if an object other than the objects to be classified is captured in the video image and input to a CNN, the CNN has the disadvantage of forcibly classifying it as a fish species. YOLO also classifies objects from a single image, which can degrade classification performance depending on the training, and it is difficult to obtain accurate classification results in every frame.
First, an evaluation of the proposed method was performed on Largemouth bass. In the video images of 34 Largemouth bass, the proposed method classified 33 Largemouth bass (97.06%), while one fish was not classified. Figure 6 shows the classification probabilities of the 33 Largemouth bass, which were recognized with values of 60% or greater.
Figure 6. The probability of Largemouth bass.
Figure 7 shows the classification results and frame images of YOLOv2 when the proposed method finally classifies the Largemouth bass with a probability of 0.83. It took 322 frames for the Largemouth bass to appear at the right edge and disappear at the left edge. The fish was recognized as a Mandarin fish from frame 1 to frame 22, but was correctly recognized as a Largemouth bass in the frames after frame 23. In the proposed method, the classification performance is represented by the cumulative average of the classification performance up to the last frame, even if the classification is wrong up to frame 22. Therefore, the proposed method is less likely to yield incorrect classification results. In particular, it has a high classification probability for very slow-moving fish. Figure 7i,j indicate the classification probability of YOLOv2 and the proposed method, respectively, for each frame. It can be seen that the proposed method accurately recognizes the Largemouth bass after frame 37.
Figure 7. Classification performances for video frames: Largemouth bass (83%). (a) Frame 1, (b) Frame 46, (c) Frame 91, (d) Frame 136, (e) Frame 181, (f) Frame 226, (g) Frame 271, (h) Frame 322, (i) YOLOv2, and (j) the proposed method.
Figure 8 shows the YOLOv2 classification results for each frame of one fish that the proposed method did not recognize as a Largemouth bass. The Largemouth bass appears at the top left of the camera, comes very close, and disappears to the top right. YOLOv2 misclassified it as a Common carp from frame 9 to frame 15, after which it did not recognize any fish. YOLOv2 did not classify correctly because each frame image did not show the overall outline of the fish, but only one part of it. For such video images, CNN and YOLOv2 produce wrong classification results and wrong counts of the number of fish of each species. The proposed method may also fail to classify a fish species; however, it does not count individuals from misclassified results.
Figure 8. Classification of YOLOv2 for Largemouth bass video: Non-detection. (a) Frame 1, (b) Frame 9, (c) Frame 15, and (d) Frame 63.
Second, we evaluated the proposed method on Bluegill. The proposed method recognized 62 (93.94%) of a total of 66 fish as Bluegill, and did not classify 4 Bluegills. Most of the 62 Bluegills were recognized with classification probabilities of more than 60%, as shown in Figure 9. The proposed method using the heuristic approach shows a very high recognition rate for the detection of fish, and can accurately count the fish population.
Figure 10. Classification performances for video frames: Bluegill (27%). (a) Frame 1, (b) Frame 5, (c) Frame 9, (d) Frame 13, (e) Frame 17, (f) Frame 21, (g) Frame 25, (h) Frame 30, (i) YOLOv2, and (j) the proposed method.
Figure 10i,j show the classification performances of YOLOv2 and the proposed method, respectively, for each frame. If the CNN or YOLO network recognizes the images in frames 1 to 8 as Common carp, and fails to recognize the images in frames 11, 12, 15, 16, 17, 25, and 26 as fish, the fish species is recognized incorrectly. The proposed method yields a low probability after frame 21, but correctly recognizes the fish as Bluegill.
Figure 11 shows one example of the four cases in which the proposed method did not recognize the Bluegill. It took a total of 23 frames for the Bluegill to appear from the bottom right and disappear down the left. YOLOv2 recognized the Bluegill in frames 5, 7, 8, and 23, but did not recognize any fish in the other frames. In the proposed method, if fish have not been recognized in the capture region of the image for a certain period of time, they are not classified as objects. The video images used in the experiment are very different from the images of Figure 1a, on which YOLOv2 was trained; YOLOv2 is not fully trained for very fast-moving fish such as these, due to the lack of such learning images. The proposed method is very simple and intuitive, while retaining the advantages of YOLO in video images of underwater environments. The heuristic method has shown excellent performance in classifying and counting objects in video images. Therefore, the proposed method is considered to be useful not only for objects in the underwater environment, but also for other objects.
Figure 11. Classification of YOLOv2 for Bluegill video: Non-detection. (a) Frame 1, (b) Frame 7, (c) Frame 18, and (d) Frame 23.
Figure 12 shows the results of a comparative experiment on recognition rates with other deep-learning-based methods. For GoogLeNet, Vgg16, and Vgg19, the recognition rate was measured at the point in time when the fish is in the center of the video image. In the case of YOLOv2, the recognition rate was measured over all frames, from the point when a fish is recognized in the video image to the moment it leaves. Furthermore, YOLOv2 and the proposed method used the same learned YOLO network. All methods showed a high recognition rate of 0.85 or higher. The proposed method has a recognition rate of 0.95, while the other methods have recognition rates of 0.88–0.89. In the proposed method, the result of the previous frame affects the recognition result of the current frame. This cancels out recognition errors in individual frames, and yields a performance improvement of about 0.08 over the other methods.
Figure 12. Experimental results compared with other methods (GoogLeNet, Vgg16, Vgg19, and YOLOv2) for recognition rate.
5. Conclusions
YOLO shows excellent performance in object recognition, but the performance varies depending on network learning. It recognizes objects by processing the images of each frame independently of one another, which means that the classification results of the previous frame do not affect those of the current frame. By accumulating the object classification results from past frames onto the current frame, we proposed a method to accurately classify objects and count their number in sequential video images. The proposed method shows very good classification performance in video images taken in underwater environments, with high classification probabilities of 93.94% and 97.06% in the test videos of Bluegill and Largemouth bass, respectively. The proposed method is also affected by the performance of YOLO, but its performance was improved by applying the heuristic method that mimics human experience and learning.
Author Contributions: Conceptualization—J.-H.P.; methodology—J.-H.P.; software—J.-H.P.; validation—J.-H.P. and C.K.; writing—original draft preparation—J.-H.P.; writing—review and editing—J.-H.P. and C.K. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by Korea Environment Industry & Technology Institute (KEITI) through the Exotic Invasive Species Management Program, funded by Korea Ministry of Environment (MOE) (2017002270002).
Funding:
Conflicts This work was
of Interest: Thesupported by Korea
authors declare Environment
no conflict Industry & Technology Institute (KEITI) through
of interest.
the Exotic Invasive Species Management Program, funded by Korea Ministry of Environment (MOE)
(2017002270002)
References
1. Buckland, S.T.; Turnock, B.J. A Robust Line Transect Method. Biometrics 1992, 48, 901–909. [CrossRef]
2. Järvinen, O.; Väisänen, R.A. Estimating relative densities of breeding birds by the line transect method. Oikos
References
1975, 7, 43–48. [CrossRef]
3. Buckland, S.T.; Garthwaite, P.H. Quantifying Precision of Mark-Recapture Estimates Using the Bootstrap
and Related Methods. Biometrics 1991, 47, 255–268. [CrossRef]
4. Miller, C.R.; Joyce, P.; Waits, L.P. A new method for estimating the size of small populations from genetic mark-recapture data. Mol. Ecol. 2005, 14, 1991–2005. [CrossRef] [PubMed]
5. Vitkalova, A.V.; Feng, L.; Rybin, A.N.; Gerber, B.D.; Miquelle, D.G.; Wang, T.; Yang, H.; Shevtsova, E.I.;
Aramilev, V.V.; Ge, J. Transboundary cooperation improves endangered species monitoring and conservation
actions: A case study of the global population of Amur leopards. Conserv. Lett. 2018, 11, 12574. [CrossRef]
6. Bischof, R.; Brøseth, H.; Gimenez, O. Wildlife in a Politically Divided World: Insularism Inflates Estimates of
Brown Bear Abundance. Conserv. Lett. 2016, 9, 122–130. [CrossRef]
7. Siddiqui, S.A.; Salman, A.; Malik, M.I.; Shafait, F.; Mian, A.; Shortis, M.S.; Harvey, E.S. Automatic fish species
classification in underwater videos: Exploiting pre-trained deep neural network models to compensate for
limited labelled data. ICES J. Mar. Sci. 2018, 75, 374–389. [CrossRef]
8. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks.
Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [CrossRef]
9. Rekha, B.S.; Srinivasan, G.N.; Reddy, S.K.; Kakwani, D.; Bhattad, N. Fish Detection and Classification Using
Convolutional Neural Networks. In Proceedings of the International Conference on Computational Vision
and Bio Inspired Computing, Coimbatore, India, 25–26 September 2019; pp. 1221–1231.
10. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In
Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015-Conference Track
Proceedings, San Diego, CA, USA, 7–9 May 2015.
11. Park, J.-H.; Choi, Y.-K. Efficient Data Acquisition and CNN Design for Fish Species Classification in Inland
Waters. J. Inform. Commun. Converg. Eng. 2020, 18, 106–114. [CrossRef]
12. Briseño-Avena, C.; Schmid, M.S.; Swieca, K.; Sponaugle, S.; Brodeur, R.D.; Cowen, R.K. Three-dimensional
cross-shelf zooplankton distributions off the Central Oregon Coast during anomalous oceanographic
conditions. Prog. Oceanogr. 2020, 188, 102436. [CrossRef]
13. Swieca, K.; Sponaugle, S.; Briseño-Avena, C.; Schmid, M.S.; Brodeur, R.D.; Cowen, R.K. Changing with the
tides: Fine-scale larval fish prey availability and predation pressure near a tidally modulated river plume.
Mar. Ecol. Prog. Ser. 2020, 650, 217–238. [CrossRef]
14. Schmid, M.S.; Cowen, R.K.; Robinson, K.; Luo, Y.J.; Briseño-Avena, C.; Sponaugle, S. Prey and predator
overlap at the edge of a mesoscale eddy: Fine-scale, in-situ distributions to inform our understanding of
oceanographic processes. Sci. Rep. 2020, 10, 1–16. [CrossRef] [PubMed]
15. Rezende, E.; Ruppert, G.; Carvalho, T.; Theophilo, A.; Ramos, F.; de Geus, P. Malicious Software Classification
Using VGG16 Deep Neural Network’s Bottleneck Features. In Information Technology-New Generations,
Proceedings of the Advances in Intelligent Systems and Computing; Latifi, S., Ed.; Springer: Cham, Switzerland,
2018. [CrossRef]
16. Liu, C.; Cao, Y.; Luo, Y.; Chen, G.; Vokkarane, V.; Ma, Y. Deepfood: Deep learning-based food image
recognition for computer-aided dietary assessment. In Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Proceedings of the Inclusive Smart
Cities and Digital Health. ICOST 2016; Chang, C., Chiari, L., Cao, Y., Jin, H., Mokhtari, M., Aloulou, H., Eds.;
Springer: Cham, Switzerland, 2016; pp. 37–48. [CrossRef]
17. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-ResNet and the impact of residual
connections on learning. In Proceedings of the 1st AAAI Conference on Artificial Intelligence, AAAI 2017,
San Francisco, CA, USA, 4–9 February 2017; pp. 4278–4284.
18. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago,
Chile, 7–13 December 2015; pp. 1440–1448. [CrossRef]
19. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42,
386–397. [CrossRef] [PubMed]
20. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las
Vegas, NV, USA, 30 June 2016; pp. 779–788. [CrossRef]
21. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
[CrossRef]
22. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. In Proceedings of the IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017;
pp. 6517–6525. [CrossRef]
23. Park, J.-H.; Choi, Y.-K.; Kang, C. Fast Cropping Method for Proper Input Size of Convolutional Neural
Networks in Underwater Photography. J. Soc. Inf. Disp. 2020, 28, 872–881. [CrossRef]
24. Lu, X.; Lin, Z.; Shen, X.; Mech, R.; Wang, J.Z. Deep multi-patch aggregation network for image style,
aesthetics, and quality estimation. In Proceedings of the IEEE International Conference on Computer Vision,
Santiago, Chile, 7–13 December 2015; pp. 990–998. [CrossRef]
25. Kang, L.; Ye, P.; Li, Y.; Doermann, D. Convolutional neural networks for no-reference image quality
assessment. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1733–1740. [CrossRef]
26. Lu, X.; Lin, Z.; Jin, H.; Yang, J.; Wang, J.Z. Rapid: Rating pictorial aesthetics using deep learning. In
Proceedings of the 22nd ACM International Conference on Multimedia (MM’14); Association for Computing
Machinery: New York, NY, USA, 2014; pp. 457–466. [CrossRef]
27. Ma, S.; Liu, J.; Chen, C.W. A-lamp: Adaptive layout-aware multi-patch deep convolutional neural network
for photo aesthetic assessment. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2017, Honolului, HI, USA, 21–26 July 2017; pp. 722–731. [CrossRef]
28. Park, J.-H.; Hwang, K.-B.; Park, H.-M.; Choi, Y.-K. Application of CNN for Fish Species Classification. J.
Korea Inst. Inf. Commun. Eng. 2019, 23, 39–46.
29. Rosenblatt, M. A central limit theorem and a strong mixing condition. Proc. Natl. Acad. Sci. USA 1956, 42, 43.
[CrossRef] [PubMed]
30. Hoeffding, W.; Robbins, H. The central limit theorem for dependent random variables. Duke Math. J. 1948,
15, 773–780. [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional
affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).