Machine Learning Approach For Bird Detection
Abdus Sattar
Department of Computer Science and
Engineering
Daffodil International University
Dhaka, Bangladesh
[email protected]
Abstract: Artificial Intelligence has seen remarkable growth in bridging the gap between the capabilities of humans and machines. Progress in Computer Vision with Deep Learning has been established and is improving with time, primarily through one particular algorithm, the Convolutional Neural Network (CNN). Here, we use a CNN to recognize images. A Convolutional Neural Network is a Deep Learning algorithm that can take an image as input, assign importance to different objects in the image, and differentiate one from the other. Our model reaches 95.52% validation accuracy. The approach obtained from this experimentation will help to identify a bird from a picture.

Keywords: Machine Learning, Bird detection, CNN, Deep Learning, Image recognition.

I. INTRODUCTION

Birds are beautiful creatures of nature and a source of joy and pleasure. Bangladesh is a favorite homeland of a great variety of birds. Birds are a part of nature and enrich our natural beauty. Bangladesh is a unique abode of birds; some are wild and some are domestic. In Bangladesh, there are about 575 species of migratory birds. The names of many of these birds are not even known to us. They keep our nature clean.

In this age of urbanization and technology, we are moving away from nature day by day. As a result, we are losing our knowledge of wildlife, and our generation does not know much about birds, some of which it cannot even recognize. To address this problem, we built this project with the help of a Convolutional Neural Network and the YOLOv3 algorithm.

Bird detection is a great tool for wildlife observation. First, it can serve to filter huge datasets by focusing on the regions of interest with bird activity. In general, pre-trained models have two advantages. First, a sufficient number of pre-trained models are freely available. Second, they help training converge quickly. However, they are not applicable to some specific scenarios. In this project, we use a bird image as input. We trained the model on almost 1500 images of nine kinds of birds. Given an image as input, the system shows the name of the bird. It only works for the birds it was trained on. With the help of this project, one can easily learn the name of a bird. It can also identify the separate classes of birds in one image.

In recent years, Bangladesh has been going through an ICT revolution. The development of Bangladesh in the field of information and communication technology is remarkable, but the application of ICT in the field of wildlife is very limited. Moreover, Bangladesh is a land of natural beauty, and birds add to that beauty. Birds eat insects in agricultural fields and keep our harvest safe. They play an important role in keeping our ecology in balance. Drawn by urbanization, we are moving away from nature and birds day by day. The use of digital devices and up-to-date technology is still absent in the field of wildlife. We need to engage in more ICT applications in this field if we want to connect our generation to wildlife.

Besides, most of our young people do not have much knowledge about wildlife. They tend to move away from nature. Hence, they are unable to recognize many birds or gather more information about them. But it is important to know about birds. They see the birds but do not know their names. This paper focuses on the current situation in Bangladesh and tries to bring about a change using AI and image processing, which will help people, especially children, to learn the names of birds.

AI is an important term in the field of ICT in this modern age. Applying AI will help us gather knowledge and connect with wildlife. To give a proper solution, it is necessary to find out the problems and related requirements in this field. It is also necessary to know the government policies or regulations and the software industry requirements, along with the course methodologies, to implement AI in this sector. In this section, we go through a short survey of young people and children to find out the problems.

The main questions that motivate the problem formulation in this paper are given below:
- Can you identify the birds whose names you know?
- Do you know the names of the birds that you notice around you?
- What are the limitations of working with AI in this sector to detect birds?
- Does the project identify the various birds in one picture?

In this part of our research paper, we present the experimental data set, the data pre-processing, the architecture of the model,
Authorized licensed use limited to: KTH Royal Institute of Technology. Downloaded on April 08,2021 at 05:34:05 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021).
IEEE Xplore Part Number: CFP21ONG-ART; 978-0-7381-1183-4
the learning rate and the optimizer, the data augmentation process, and the training of the model.

There are some benefits of using AI in bird detection. Some of the technical objectives are given below:
- Develop an efficient model to detect birds.
- Inspire software developers to work with machine learning using the model.
- Integrate the model into mobile apps and websites.

Other objectives are given below:
- Help young people, especially children, to detect unknown birds.
- Help to gather knowledge about birds.
- Add a new dimension to object detection.
- Help to identify birds easily.

can increase their knowledge about birds and would be quite interested in them. In the winter season, many migratory birds arrive. So, from the perspective of Bangladesh, this system will play an important role.

III. RESEARCH METHODOLOGY

The proposed methodology in this paper is a CNN trained on nine different classes of images, which attains 95.52% accuracy.
CNN model, four dimensions were used for the transformation process. First, the full images are cropped into four-dimensional variables to reduce avoidable objects in the images. This is followed by a scaling operation that raises the dimensions and resolution to the required standard.

C. Model of the Architecture

A classic model is used to distinguish birds; here, four convolutional layers followed by two fully connected layers are used. The fully connected layers are combined with batch normalization. The first CNN layer uses a kernel size of 3 with images of dimension 32x32 in RGB color mode and 32 filters. Stride, padding, and activation functions are used in the CNN layers. This is followed by a layer with 64 filters and a stride of one. The structure of the CNN layers is depicted in Figure 4.

network's weights from one iteration to the next [13]. The weights are updated according to the iteration index l.

The learning rate performs well for both 2x and 11x model sizes, which means that it does not depend on the model size. Selecting the learning rate is one of the most important hyperparameter choices for the model. The scheme for calculating every update uses learning-rate decay.

We use a small learning rate of 0.001; lowering the learning rate reduces the error at first. We use a callback function for learning-rate reduction that is provided by Keras. This callback is designed to refine the model weights by reducing the learning rate when the model stops improving.

E. Data Annotation

We annotated our data and checked the labeling of the dataset; at the same time, the box ranges were also matched.
Data annotation is the process of adding metadata to a
dataset. This metadata usually takes the form of tags, which
can be added to any type of data, including text, images, and
video.
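
As a rough illustration of the annotation check described above, the sketch below validates the annotation of one image: every tag must name a known class, and every box must lie inside the image. The record layout, the field names (`label`, `box`), the function name `check_annotation`, and the placeholder class names are assumptions made for this example; the paper does not specify its annotation format.

```python
from typing import Dict, List

# Hypothetical class names: the paper uses nine bird classes but does not list them.
CLASSES = ["class_%d" % i for i in range(9)]


def check_annotation(width: int, height: int, objects: List[Dict]) -> List[str]:
    """Return the problems found in one image's annotation (empty list if valid)."""
    problems = []
    for i, obj in enumerate(objects):
        label = obj.get("label")
        # The tag must be one of the known classes.
        if label not in CLASSES:
            problems.append("object %d: unknown label %r" % (i, label))
        x_min, y_min, x_max, y_max = obj["box"]
        # A box must have positive area and stay within the image bounds.
        if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
            problems.append("object %d: box %r outside image" % (i, obj["box"]))
    return problems


good = [{"label": "class_0", "box": (10, 20, 110, 140)}]
bad = [{"label": "sparrow?", "box": (10, 20, 500, 140)}]
print(check_annotation(320, 240, good))
print(check_annotation(320, 240, bad))
```

Running such a check over the whole dataset before training is one simple way to catch mislabeled images or boxes that were drawn against a different image size.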
performance. Our validation data contributes to detecting birds, like our training and testing data, and also to detecting unknown images. The proposed model is given below:

Figure 8. Train and validation accuracy.

Here, the performance of the proposed model is presented briefly using the confusion matrix. The false positive (FP) rate counts negative values detected as positive. The true positive (TP) rate counts positive values detected as positive. The true negative (TN) rate counts negative values correctly detected as negative, and finally the false negative (FN) rate counts positive values detected as negative, for every kind of dataset.

V. RESULT AND ANALYSIS

There are many successful works in this field, but not every analysis has reported a high success ratio. There are many methods, such as CNN, R-CNN, Faster R-CNN, and YOLOv3. Here, we used a convolutional neural network (CNN) based object detection system. We also used the YOLOv3 framework. Our analysis gives higher accuracy than some other detection approaches because we train on the objects directly, without including any extra data or an extra fine-tuning process. Some previous detection methods and their accuracy rates are given below.

VI. CONCLUSION

Here, we briefly describe convolutional neural network based detection. We detect objects with the help of a convolutional neural network (CNN). We can see from the observations that CNN is the best way of detecting the object. In this paper, we obtained an accuracy rate of 95.52%, which is better than the other methods in the literature. We tried our best to get a better accuracy rate in spite of having only a small amount of data. This type of detection is valuable work for our country.

In the future, our method needs to be improved in terms of computation time. This method could be extended to video detection of any kind of object. In the near future, we will aim our research at learning any kind of object directly. We should also extend the detection process to learn new object classes. Finally, we can also improve the algorithm that generates source combinations. Our future goal is to make an Android system which
can automatically detect any kind of bird. In the future, we aim to develop our database in order to get good accuracy and to create an Android app. Making a good Android app is still a challenging process. In the future, it will be very helpful for people to immediately identify a bird.

REFERENCES
[1]. K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in ICCV, 2017.
[2]. R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in CVPR, 2014.
[3]. D. Chen, G. Hua, F. Wen, and J. Sun, "Supervised transformer network for efficient face detection," in ECCV, 2016.
[4]. Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems, pp. 379–387 (2016)
[5]. Erhan, D., Szegedy, C., Toshev, A., Anguelov, D.: Scalable object detection using deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2147–2154 (2014)
[6]. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The Pascal visual object classes challenge 2012 (VOC2012) results (2012). http://www.pascal-network.org/challenges/VOC/voc2011/workshop/index.html
[7]. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
[8]. D.R. Hush and J.M. Salas, "Improving the learning rate of back-propagation with the gradient reuse algorithm," IEEE 1988 International Conference on Neural Networks, 1988.
[9]. Najibi, M., Rastegari, M., Davis, L.S.: G-CNN: an iterative grid based object detector. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2369–2377 (2016)
[10]. S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," arXiv preprint arXiv:1502.03167, 2015.
[11]. K. He, X. Zhang, S. Ren, and J. Sun, "Identity mappings in deep residual networks," in ECCV, 2016.